Did you do any load/performance tests for your UT Ingest Server? Would be really nice to have a video just on that :) Also scaling of this server is an interesting topic...
This can be solved simply by the client notifying the server once the file upload is done. This is just over-engineering at its finest. His reasoning was that there will be ghost files if the client doesn't notify the server. The solution to that is for the client to always upload to a temp location; the server moves the file to its actual location once the client confirms the upload finished. Then you set up an S3 lifecycle rule to delete stale temp files based on their last-modified date.
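The temp-prefix cleanup described above can be sketched as a lifecycle rule. The object below matches the shape accepted by the AWS SDK's PutBucketLifecycleConfiguration; the `tmp/` prefix, rule ID, and one-day window are illustrative choices, not anything UploadThing actually uses.

```javascript
// Sketch: write uploads under tmp/ and let an S3 lifecycle rule
// expire anything still sitting there after a day. Confirmed uploads
// get copied out of tmp/ before the rule can touch them, so only
// abandoned "ghost files" are ever deleted.
const lifecycleConfig = {
  Rules: [
    {
      ID: "expire-abandoned-tmp-uploads", // illustrative rule name
      Status: "Enabled",
      Filter: { Prefix: "tmp/" },         // only the staging area
      Expiration: { Days: 1 },            // ghost files vanish after a day
    },
  ],
};
```

On confirmation, the server would copy `tmp/<key>` to its final key and delete the temp copy; anything never confirmed is garbage-collected by the rule.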
Does it allow resuming, and does it improve the time for smaller uploads? Still, they made their changes, and going back to plain S3 isn't feasible for their marketing either. Plus they now support other types of "buckets", so I guess it isn't just about S3 being used inefficiently; it also gives them marketing leverage to be more independent and agnostic
@@t3dotgg But YT manages their object storage :) I'm genuinely surprised there's a market for what your company is offering; it's something an above-average developer could probably knock out in a day as part of their sprint. That said, it takes real business savvy to identify a need and turn it into a viable product with customers. No criticism of your product at all; it's more of an eye-opener for those of us in tech about how smart business moves can make all the difference.
@@hemanthaugust7217 True, it's really amazing how the JavaScript guys can complicate everything. Any reasonable back-end dev could finish that in a single day with a lib
Uploading should really just be a single chunked-transfer HTTP request with a single response. The server can easily authenticate that and save the partial data to get resumability, and more
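A rough, in-memory sketch of that resumability idea (tus-style offset negotiation; a real server would persist chunks to disk or S3 multipart parts rather than a Map, and authenticate each request):

```javascript
// Server side: track how many bytes have been durably stored per
// upload, and only accept appends at exactly that offset.
class ResumableStore {
  constructor() { this.chunks = new Map(); } // uploadId -> Buffer
  offset(id) { return (this.chunks.get(id) ?? Buffer.alloc(0)).length; }
  append(id, offset, data) {
    const have = this.chunks.get(id) ?? Buffer.alloc(0);
    if (offset !== have.length) throw new Error("offset mismatch");
    this.chunks.set(id, Buffer.concat([have, data]));
    return have.length + data.length; // new total
  }
}

// Client side: after a dropped connection, ask the server for its
// offset and re-send only the remaining slice of the file.
function resume(store, id, file) {
  const from = store.offset(id);
  return store.append(id, from, file.subarray(from));
}
```

This is the core of what protocols like tus standardize: a HEAD request for the offset, then a PATCH carrying bytes from that offset onward.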
Thanks @dsherc, this is insane. Just a side note, as this could be a bit misleading: I see that you mentioned we can't check the file size of files being uploaded. But even if we can't dynamically check the size while uploading, we can limit the max file size by adding a size cap to the pre-signed POST. Essentially this is what I did: on upload requests we ask for the file size, and we return the presigned URL with the file size added to the signature as a cap.
I know this will sound smug and I am sorry, but: 100% faster should be 0 seconds. So the 377% faster and 509% faster mentioned at 3:10 make no sense. What do those numbers mean? How did you calculate them?
I believe he meant something like: 100% = two times as performant -> final time x/2; 377% = 3.77 times as performant -> final time x/3.77. If it took 5 seconds and now it takes me 1 second, I would say my thingy is doing 500% better, because I can do five thingies in the time the old thingy took to do one
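Conventions genuinely differ here. One common reading of "N% faster" is the old-to-new time ratio minus one, which keeps "100% faster" meaning "twice as fast" rather than "instant":

```javascript
// "N% faster" as (old time / new time - 1) * 100.
// Under this reading, 377% faster means the new path does 4.77x the
// work per unit time; time never has to go negative.
function percentFaster(oldSeconds, newSeconds) {
  return (oldSeconds / newSeconds - 1) * 100;
}
```

Under this convention, going from 5 s to 1 s is 400% faster; the "500% better" figure in the comment above uses the slightly different "N% = N/100 times the throughput" reading.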
Impressive. But wouldn't it be even faster if we removed some more requests and rolled the "Your Server", "UT Ingest Server" and "S3" components into a single thing managing uploaded files? Something that kind of works as a common base for data?
@@danhorus oh right, I got the impression it was clients own authentication and direct upload to S3. I obviously don't understand what this solution provides.
@@dancarter5595 An easier way to upload things? They also add some code to the process so you don't need to do it yourself. I mean, it's like using Vercel so you don't have to set up your infra.
Yeah, that's the thing here that sort of defeats using it for anything production-grade that touches sensitive user data. In the EU at least, because the US of course doesn't care about user data. You'd be in breach of GDPR: since you are the administrator of the data, you cannot share it with third parties without consent.
I think you are ignoring a huge security loophole in your logic. If the browser gets the presigned URL, then they can just use it directly without having to go through your ingest server, thus ending up with ghost files anyways
Fwiw, Lambdas are not the only way to have serverless compute in AWS. ECS Fargate also offers the benefits of serverless (scale to zero, pay for what you use, etc.) without the limitations of Lambda.
@@kevboutin everything is ok in the right context. Personally I tend to shy away from anything that names itself something that it clearly isn't. There are always servers...
@@m12652 so the name of something is your problem? The name is not a problem for me if it solves problems and increases productivity for less money. Priorities always vary I suppose. 🤷♀
So the upload/forward from the UT ingest server to the "S3" is now not validated, which means that if the connection between those two fails at any point, you get invalid results. That is a huge cost. In theory, even if you had validation, the ingest server would need to store the files until the actual upload/forward completes; yes, even if you practically pipe the upload directly between two sockets.

Additionally, you keep connections alive (from the file upload to the ingest server) while waiting for the response of the external server. That's not good: if those servers take longer than expected to respond, your ingest server may stack up a bunch of inactive sockets which it keeps open for no reason other than waiting. You essentially now have an external bottleneck for your hosted server, costing you resources.

Also, as you said yourself, the difference is much higher with smaller files, which just means your per-request overhead got reduced; of course fewer requests means less added latency. The percentages are kind of misleading. You would actually need a graph that shows the difference depending on file size.
I love it. I had implemented the same structure you had in the past, and I was planning on creating an ingest layer to propagate data, super similar to your architecture. That's a great validation of the concept. I would love to use your project, but I run everything over gRPC to move the data.
The bring-your-own-bucket is really important. We have contracts at work that specify we have to store customer data in Australia, so if we can't control where it's stored, we can't use the service.
Well, that is nothing surprising; everyone should know that every serverless or cloud-computation application has overhead. It is like saying "the newly built file upload in Rust is 10x faster than in JavaScript" lol
I'd love to see Theo work on some Remix projects. Remix offers a great deal of built-in type safety, eliminating the need for extra implementation effort.
Did you account for filesystem caching before/after in your demo? I assume so, given the breadth of architecture changes you described. But caching recently used files in RAM (as modern operating systems tend to do) can make a very noticeable difference in responsiveness. Especially if the files come from spinning disks, network storage, RAID with parity, SATA SSDs... Pretty much anything but NVME. Any kind of A/B performance testing where caching is a possibility requires either pre-caching all inputs (run it several times until the numbers look stable) or somehow guaranteeing that the inputs will never be cached. I'm sure you already knew that. But people often forget that it applies to their own demos.
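The pre-caching advice above can be folded into the benchmark harness itself: discard a few warm-up runs (letting the OS page cache and JIT settle), then report the median of the measured runs so one cold outlier can't skew an A/B comparison. A sketch, with arbitrary warm-up and run counts:

```javascript
// Run `fn` a few times untimed to prime caches, then time it `runs`
// times and return the median duration in milliseconds.
function benchmark(fn, { warmup = 3, runs = 9 } = {}) {
  for (let i = 0; i < warmup; i++) fn();            // prime caches, JIT, etc.
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = process.hrtime.bigint();
    fn();
    times.push(Number(process.hrtime.bigint() - t0) / 1e6); // ns -> ms
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(times.length / 2)];       // median, not mean
}
```

The median matters for exactly the reason the comment gives: the first cold-cache run can be an order of magnitude slower than the rest.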
If theo makes this fully free (100% self-hosted for everyone) I will be very happy. It would no longer be a service then, but they could offer premium capabilities for companies
Who is this for? Why am I able to just do uploads to s3 in all my apps with aws apis/sdks without 3rd party packages to help, let alone a 3rd party saas service? Honest question. I just don't get why this exists or would be popular beyond maybe a brand new dev following a tutorial where s3 is just out of scope... I'm either too dumb to see where the value is or too smart to depend on a saas to do what the aws sdks do for free.
> Says "Honest question" then immediately shits on the people using it by calling them "brand new devs" Assuming this is actually honest, maybe check out my other videos about UploadThing and S3? tl;dr - if you think S3 is easy to set up, your implementation is FULL of security issues and probably offers a bad user experience too Most real companies with object storage have built their own UploadThing-like solution, but we're a generic that anyone can use at any scale :)
Even these pitches at the end "now you can bring your own bucket!" and "now you can run our server directly in your infra!" seem baffling to me. We already have our own buckets and our own infrastructure simply by using s3 directly. How are those selling points of introducing a saas between us and s3? Again, honest question. I have never felt more out of touch, and can't tell if thats a good or a bad thing hahaha.
This is just meant to be educational, to show what things can be slow and how to resolve them, for inexperienced developers who haven't reached or considered these steps on their journey.
@@macchiato_1881 That was my 2nd guess but I was seeing a lot of comments in bad faith so I really couldn't tell without any tone indicators lol. In a way my reply speaks to them too
@@sanjaux why do you need tone indicators? People like you need to handle negative comments better. I get not all criticism is good. But are you just going to whine at every valid negative criticism or joke you get?
@@macchiato_1881 Well the actual jokes no I'd ignore those, but criticism is best resolved through talking it out. Since this isn't criticism, more signs would have helped differentiate your joke from something actually worth discussing. Handle them better? I'm just trying to understand the thought process behind some comments (the serious ones)
Great success! It's also quite cute that, even after so many live-streams and videos that you have done, you end up sounding a bit like a school kid presenting their project the first-time in front of the class, when you are talking about something that you are really proud of.
This is similar to investment: companies need to say something different, since most of them aren't innovative; instead they just go back and forth between things we have done in the past so people will invest in them.
This just in. Serverless proven to be a buzzword to keep you purchasing overpriced subscription model technology. In other news, paint is wet when applied.
This statement is probably coming from someone who has never built any applications professionally using serverless solutions. It's a paradigm shift and one many people haven't wrapped their heads around yet. People fear what they do not understand and despise things that require LOTS of real world work to become proficient in.
Huge improvements! It's great that you feel ready to tackle enterprises, but I can assure you, it's not easy, not at all. Data privacy standards are scrutinized more than ever, so I'd first go for SOC, HIPAA and the EU variants of those, to have certificates you can shield yourself with against quick-shot enterprise questions :)
As someone who works with AWS for 3rd-party security reviews, those enterprise features sound nice. Still, there's a LOT of config that AWS requires (not always cheap, and constantly changing) to meet the bar. Still, this is a very cool infra design change and breakdown. I really appreciate this; folks who don't work with AWS/cloud don't understand.
I remember that serverless was designed for short-lived, lightweight, infrequent requests for particular pieces of your application's functionality. The server doesn't need to run all the time, which saves you cost, and you don't need to maintain the server. Lately it has been abused massively for all kinds of heavy tasks, which should belong on your own server, and then people complain about serverless. The comment section is full of "devs" who say serverless is bad or hosting your own server is bad, joking about web dev without understanding those subtle details. The current generation has huge skill issues imo.
@@doc8527 Yeah, I get to witness some real nutball spaghetti lambda design. If you need to manage 50+ lambdas for your backend, plus have one for every single API, troubleshooting & DevOps become a nightmare: you gotta watch every lambda metric, have so many CloudWatch logs, etc. That's where Vercel-like companies do serverless a little better; they're taking on more of that burden, but it's priiicy! I'd pay for it in a heartbeat to save me time though. Docker containers are where it's at. Fargate/ECS that thing. Even EC2 management has improved a lot with CDK + SSM scripts.
hang on, are you basically getting the files from the clients now? Will you have the same bandwidth as the direct S3? Will you pay for the ingress traffic?
I think theo should ditch this project. A lot of theo fans can gather the pitchforks at me, but just read the other comments (FTP; you can directly do this in S3; the point of uploadthing was that the data wasn't passing through their server; Uppy exists). Yeah, don't use uploadthing
When we trigger S3 uploads/copies through various means, rather than having our API state update the front end we allow our client to hit a headObject presigned url to assert that the object has successfully landed. Requires some ugly polling but it’s cheap polling
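That "cheap polling" can be kept cheap with exponential backoff between attempts. A sketch where the `check` callback stands in for a `fetch(presignedHeadUrl, { method: "HEAD" })` that reports whether the object has landed (the URL name and retry counts are illustrative):

```javascript
// Poll until `check` reports the object exists, doubling the wait
// between attempts so a slow copy doesn't hammer the endpoint.
async function waitForObject(check, { attempts = 5, baseDelayMs = 200 } = {}) {
  for (let i = 0; i < attempts; i++) {
    if (await check()) return true;  // e.g. HEAD returned 200
    await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
  }
  return false;                      // give up and surface an error
}
```

Because presigned HEAD requests carry no body, each poll costs a single request rather than a data transfer.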
Yes, frontend life is better when I can literally write everything in a single HTML file. Vanilla JavaScript + sometimes Web Components for the win. I'm a Helix IDE user so keep that in mind since that may also influence why I don't mind making web apps in that way.
I'm doing a beginner's web dev course that has a file storage project. I ran into the latency issue with this architecture on day one. Originally I tried: 1. Client sends upload request to my server. 2. Server requests signed URL from Supabase. 3. Supabase responds with URL. 4. Server sends URL to client. 5. Client uploads and notifies server when it's done. 6. Server updates db and sends success response. I can't center a div but I could tell this was horrifically slow! I noticed immediately and switched to streaming through my server to Supabase which was 2-3x faster for small files.
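A toy latency model makes the difference in that comment visible: the signed-URL flow pays for more sequential network legs before the upload completes. The 60 ms per leg is an assumed round-trip figure for illustration, not a measurement:

```javascript
// Count the sequential network legs in each flow and price them at a
// flat assumed round-trip cost. For small files, these fixed legs
// dominate the actual byte transfer time.
const LEG_MS = 60; // assumed per-leg latency, illustration only

const signedUrlFlow = [
  "client -> server (upload request)",
  "server -> supabase (request signed URL)",
  "supabase -> server (signed URL)",
  "server -> client (signed URL)",
  "client -> supabase (actual upload)",
  "client -> server (done notification)",
  "server -> client (success response)",
];

const streamingFlow = [
  "client -> server (upload)",
  "server -> supabase (stream through)",
  "server -> client (success response)",
];

const costMs = (flow) => flow.length * LEG_MS;
```

Seven legs versus three is roughly the 2-3x gap the comment reports for small files, independent of bandwidth.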
@@theairaccumulator7144 It's marketing, just like "the cloud" and "hypervisor". It sounds cooler than using someone else's server or dividing your computer's resources.
@@theairaccumulator7144 Because the developer doesn't need to worry about infrastructure (patching, load balancing, scaling, monitoring solutions). Imagine your services grow exponentially and your bare metal servers or virtual compute costs shoot through the roof because you cannot stay ahead of the usage spikes. If you know what you are doing with serverless, you will easily solve all of these problems/challenges.
Really cool stuff! Now that the infra is more flexible, something I'd love to see in UploadThing in the future is Cloudinary-like image transformations. UT would become a viable Cloudinary competitor with that! Could also be part of PicThing if you plan on doing more with it than just background removal :)
Now it's faster, but it also costs more money: you need to run a server, you need to pay for the bandwidth, and so on. So it's a trade-off: you pay more for your infra, and you get a better user experience. It's the same as with auth: you can use a 3rd-party auth system, which saves you a ton of work, but you can't control the user experience down to the very details.
Infra matters more than anything the frontend/client could ever achieve, because on the frontend you can only show a loader and nothing else; the client has limited internet bandwidth.
But what did you use to build your ingest server?!?! TypeScript? .NET? Go? Rust? Something else??? I wanna know the details about your serverFULL architecture!!! There are no details in your blog post either about what you used to build your ingest server, how it's hosted, etc. I'm extremely interested in what you landed on for those tech choices.
I wonder how pricing would work with "bring your own bucket". But we're very excited for it, since our organisation has rules on what geolocation a bucket can exist in, and even just on using local infrastructure.
@@Itsneil17 You know that this is like saying "just make your own WordPress"? I guess UploadThing is simpler, but getting it right is really hard. That's why we use abstractions that hide the real complexity
It's kind of sad that resumable file transfer is a big feature now, because I remember it being a standard thing when I was a kid. It was lost somewhere along the way, and I'm glad to see someone is paying attention.
I see the upload to the bucket from the client browser goes through the ingest server and is forwarded to the bucket hosting server. Here is an idea for custom file scanning/checking: could there be a future where a website can host its own "approval server" that receives a connection from the ingest server, "listens in" on the file as it is being uploaded to the bucket server, and gives a go/no-go back to the ingest server?

It doesn't seem like it would slow down the upload (the file is scanned as it is uploaded), and it takes barely any time to get the green light; if it gets rejected, the ingest server just tells the bucket server to discard the upload and returns an error to the client browser. With how fast "just forward the packet" seems to be, it is mostly up to the approval server to respond quickly enough. Headers are always at the start and are the most commonly scanned part, so by the time the file has uploaded, the headers have been processed and a green light has been given to ingest. Just an idea; let me know what you think.
I'm filming a video tomorrow about all of the dumbest things people have said about UploadThing. Reply with some good ones here and you might get featured ;)
"Typical case of things developers care about, but the customers dont"
- some twitter user
@@martinlesko1521 that’s the one that inspired the video :’) The security one was too good as well
I've been using uploadthing for a long time now. I know how an S3 bucket works, but honestly I got screwed up handling the permissions of S3 initially. Uploadthing is faster, smoother to configure & cleaner in its operations. I hope uploadthing becomes the norm for all businesses. It's really good. Wishing good luck to Theo and Julias.
"Would rather use the much more stable and simpler Amazon S3, and does speed even matter? The user should be fine waiting a few more seconds." - Some guy in discord
"I mean, just self host. ¯\_(ツ)_/¯" - Another random discord guy
Rule No.2 when you make an App: Make it slow so that when you remove the slow logic in the code, you can brag about how fast it became.
What is rule No. 1?
I mean, even if you didn't try to do that consciously, it would happen: you don't write everything perfectly the first time, especially when working on an MVP. It has to work before it can be optimized
😂
@@bastianventura Exactly, premature optimization is the death of projects. Make it work, then make it fast
@@bastianventura Dude, it's not a nuclear-fusion-equation-analysing app! It's a freakin' S3 uploader! You could have it up and running in 1 prompt! But I guarantee you 90% of its code is to limit your ability to upload based on your tier... You should 1. PLAN 2. CODE 3. Optimize. I guess he missed 1.
the web dev world is slowly reverting. soon we will get "we used literally zero npm packages and just vanilla JS, and our product shipped 10x faster, and the average API response time is 0.0001ms"
bro i'm writing on paper.
and I am here moving from vanilla JS into npm land..
who wouldve known that less is more
Curiously, I recently discovered a way, with the vanilla Navigation API and View Transitions, to make an app like Next.js, with all the features, faster, and with no build step needed
While the tooling has a few npm packages, for sure, Astro is great for that: you can ship zero JS if you want, get a proper grid layout with a few lines of CSS (as much as I love Tailwind, it adds pages and pages of CSS), and the (optional) SSR features like Astro Actions are specifically designed to work without JS.
1) S3 does support resumability
2) File sizes can be checked using `content-length-range`
3) S3 can reject on file extension and MIME types
4) You could have ditched Lambda and done a webhook back to the server
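For context on points 2) and 3): those limits live in the policy document behind an S3 presigned POST, which S3 enforces server-side. A minimal sketch of building that policy (the bucket, prefix, and expiry are made up, and the SigV4 signing of the policy is omitted):

```javascript
// Build the base64 policy a browser submits alongside its form
// upload. S3 evaluates these conditions itself, so the size cap and
// MIME restriction hold even if a client bypasses your UI entirely.
function buildPostPolicy({ bucket, keyPrefix, maxBytes, mimePrefix, expiresAt }) {
  const policy = {
    expiration: expiresAt.toISOString(),
    conditions: [
      { bucket },
      ["starts-with", "$key", keyPrefix],
      ["content-length-range", 0, maxBytes],        // hard size cap
      ["starts-with", "$Content-Type", mimePrefix], // e.g. "image/" only
    ],
  };
  // The browser posts this base64 policy (plus its signature) as
  // form fields next to the file itself.
  return Buffer.from(JSON.stringify(policy)).toString("base64");
}
```

Uploads larger than `maxBytes` or with the wrong `Content-Type` are rejected by S3 itself, no Lambda involved.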
Truth! I was shaking my head during so much of this.
very very true
Lmao, expecting proper knowledge from Theo is stupid. He's a YouTube influenza
2024 is the year of serverlesslessness
Wouldn't it be a serverfulness?
bro left vercel and realised serverless is better
And serverless was never actually serverless
Or serverfulness
Ran out of VC money 😂
Theo finally discovered servers. Massive win
I can make it even simpler by removing your server and just using the free and open-source Uppy to upload directly to S3/R2 or wherever; it has resumability and other plugins for free too
I think the difference is that Uppy's "ingest server" can be run by you (using Tus) or by Transloadit. You still need a server if you wish to have resumability and no ghost files, though
@@danhorus Interesting
More and more, I'm coming around to the idea that all these microservices, serverless, edge networks, etc. create way more complexity than is needed for the vast majority of use cases. We devs do love to complicate things.
but then deploying everything yourself isn't a great idea either
@@martinlesko1521 why not?
We are just learning. We want to make things better, so we try something new. Then the flaws show up and we adapt.
I've been saying that for years! Every major outage, too; it's basically always DNS or _microservices_
Resume Driven Development.
Doesn't help AWS (in particular) sell you their shit even if it's worse for you.
1 to 2 years and we will have completed a full cycle. "New" web devs are already "discovering" PHP again. Not long before people are uploading HTML files to an nginx/Apache server again and calling it "zero-dependency websites". This will be the new big thing.
One day we will figure out how to cut out the middleman entirely and upload straight to our own servers, which can then transcode files, upload them to S3, etc. Oh wait, we actually had that figured out in 2005...
I used to use TUS in C# and it was a pain in the ass, I ended up writing my own upload client and server code and the code was 10x simpler...
why would we do something faster and more logical when we can do something easy and new? Logic left the room long time ago
the other day Theo was working on Laravel, now he's going back to servers, tech really is evolving backwards
The old ways are still best.
@brainiti I don't know about best, but it helps that the old ways were resource-constrained, so we know how to make things well while staying lean
S3 has resumability. You must tweak a bunch of config and code to do it. But it works.
Wait until he figures out how quick and simple FTP is...
Wait until pfqniet realizes that this is built for people with actual users...
@@t3dotgg well, in many cases FTP was enough for enterprises, so... :D
@@d3stinYwOw it still is 😢 (sftp will NEVER die)
@@t3dotgg just steer clear of bank tech and you'll never have to find out
@@t3dotgg lol you sound so goofy when you reply to people like this
Let us know about the cost difference later down the line, because serverless tends to be very expensive, but your new infrastructure uses a lot more bandwidth on the application side
I had a debate with this bloke on his discord 2 years ago where he couldn't fathom that I refused to use serverless for a Next.js app due to serverless constraints and performance issues I was having. Good to see he's finally coming around 🎉
Next step: don't use a SaaS and set up S3 on your own
I thought the selling point was that with upload thing your data never passes through it.
it's incredible how such nice things happen when Vercel turns off the taps 😉
Why I need a service for this in the first place?
It turns out you really really don't. In fact it's probably dirtier and bad practice to use this.
Bro has 7 major versions in a year
We follow semver :)
@@t3dotgg 7 breaking changes in a year? Still insane
@@alexeydmitrievich5970 it's a new product, of course they are gonna have a lot of breaking changes
Yikes 😬
Uhu, so your customers had to rewrite the entire integration 7 times in the same year? So sad for people with real projects
I know the joke is that "this meeting could be an email"
But i feel UT could just be a blog describing best practices for S3 and or a config file.
What is the product offered?
😂 savage
So uploadthing is an abstraction on s3? S3 already has a dead simple API so what am I missing?
The most astonishing thing is how this can be a product someone pays for :) 99.99999% of it is just S3.
Did you watch the video?
one year later.
"we made file reading 10x faster and lowered our cloud cost 10x by going bare metal server. "
It is always nice to see people get excited when they reinvent the wheel
How is it even possible to upload 4MB of images in 1.5 seconds, nooo, impossible, upload so fast. I mean what are we even watching...
Wow this is massive reduction in complexity! I hope though one day we'll have technology advanced enough to use this thing called "Your server" to store a file. Sure hope we would be able to achieve even less arrows on the graph then...
The worst part is, theo trash-talked DHH's blog post about leaving the cloud just a year ago, and he is slowly getting there himself...
Theo realized that he would be homeless if he continued using serverless
we will be able to use uploadthing to upload to our own google bucket? mind blowing!
This might be a stupid question, but what is the advantage this service provides over a library integrated on my server or front end?
@@aaronevans7713 I suppose you'd only use the AWS S3 SDK in the back-end server anyway and send pre-signed URLs to the front-end, right? Otherwise you'd have to push some form of credentials to the front end. Honest question: What's the issue with a 3.2 MB (uncompressed JavaScript) client in the back-end?
This puts a limit on the bandwidth available, as you are proxying the file uploads to S3; if you have a ton of concurrent uploads, you will also need to scale your own servers.
Not if the "Ingress Server" is on AWS EC2. Instead of paying for S3 traffic coming from the internet, they are paying for S3 traffic from inside AWS (which may be even cheaper).
Incoming traffic to the server from the internet is free (well, let's say it's included in the per-hour price)
@@framegrace1 I am not talking about pricing, but about bandwidth. S3 has distributed endpoints for content delivery and you can have 100s of people upload simultaneously at high Mbps; on the other hand, your one EC2 instance is limited to whatever Mbps Amazon gives it, and if you try to upload 4-5 big files at the same time (from different users with good bandwidth) it will bottleneck for everyone
@@halfsoft If they use a normal single EC2 instance on the free tier, of course. But I guess they have someone who knows what they are doing.
@@framegrace1 And what is your point exactly? What I said is that to handle more concurrent users they will need to scale the number of instances they run. Then they need to use load balancing to distribute the content across the EC2 instances. And what's more, you lose the advantages of the distributed infrastructure of S3 that Amazon has built.
Legitimate question, but isn't 1.5s to upload 3.2MB still really slow? I don't know what kind of internet you have, but a 50 Mbps upload would've sent the data in 500ms, so what is taking the extra second?
Groundwork, check the 5:58 mark.
@@ramonsouza9846 well, 11:15 is the *upgraded* version of the app... Sure it's faster, but I wouldn't call a turtle amazing when compared to a snail if it could have the speed of a rabbit...
So, if I write my own upload logic, instead of using serverless upload services (like uploadthing), my apps will be much faster?
1.5s to upload 4 images and a total of under 4MB? That's the fast version that has chat asking how it's possible?
Yeah haha, people never deal with massive uploads, nowadays SaaS is the goat
Anyone need an S3 upload proxy? 😮
But why? U can just upload to s3 directly.
Its the old convenience vs performance choice in software. A tale as old as time.
Did you do any load/performance tests for your UT Ingest Server? Would be really nice to have a video just on that :) Also scaling of this server is an interesting topic...
This can be simply solved by the client notifying the server once the file upload is done. This is just over-engineering at its finest. His reasoning was that there will be ghost files if the client doesn't notify the server. The solution to that is the client always uploads to a temp location and the file is moved to the actual location once the client has confirmed the upload. And you set up an S3 lifecycle rule to delete stale temp files based on their age.
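For reference, the cleanup half of that scheme is just a standard S3 lifecycle rule. A sketch of the configuration (the `tmp/` prefix and one-day window are assumptions for illustration; S3 expires objects based on their age):

```python
# Hypothetical layout: clients upload into "tmp/", the server copies
# confirmed files out of it. Anything still under "tmp/" after a day is
# a ghost file and gets deleted by this lifecycle rule.
lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-unconfirmed-uploads",
            "Filter": {"Prefix": "tmp/"},
            "Status": "Enabled",
            "Expiration": {"Days": 1},  # minimum granularity S3 allows
        }
    ]
}
# This would be applied with e.g. boto3's
# s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_config)
```

Note that lifecycle expiration runs on object creation date, not "update date"; moving a confirmed file out of `tmp/` (an S3 copy + delete) is what resets the clock.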
does it allow for resume and improve the time for smaller uploads? Still, they made their changes, and going back to S3 isn't feasible for their marketing either. Plus they now support other types of "buckets", so I guess it isn't just S3 being used inefficiently; instead it gives them marketing leverage to be more independent and agnostic
Doesn't moving files cost money with S3? Not sure
@@theairaccumulator7144 Once in the region, you can transfer within the region for free. Going back out of the region will cost again
We always did that with Rails years ago using a free gem maintained by the community not a SaaS company
> This can be simply solved by client notifying the server once the file upload done.
Rule #1 of web security:
Never believe the client.
Hoped for more info about the new server ( why no serverless, what's the tech, etc. ), but this looks amazing and makes sense now 😊 Great video 😊
Sounds like serverless slop is circling back. Also Just uploading directly to S3 is theoretically still faster.
So the product is a S3 proxy server? Alright
Technically speaking, YouTube is also just a proxy server on top of object storage ;)
Technically speaking that's only a part of their API
@@t3dotgg But YT manages their object storage :) I'm genuinely surprised there's a market for what your company is offering; it's something an above-average developer could probably knock out in a day as part of their sprint. That said, it takes real business savvy to identify a need and turn it into a viable product with customers. No criticism of your product at all; it's more of an eye-opener for those of us in tech about how smart business moves can make all the difference.
@@hemanthaugust7217 True, it's really amazing how the JavaScript guys can complicate everything; any reasonable back-end dev could finish that in a single day with a lib
Everything is an API over a storage
hmm... i dont get it. why not just request s3 upload permission from the client and upload directly to s3? bit confused...
Interesting to see you share the thought process behind everything, helps to learn :)
What I have learned, when it comes to IT.. the absurd amount of work is usually necessary due to initial incompetence...
uploading should really just be a single chunked-transfer HTTP request with a single response. the server can easily authenticate that and save the partial data to get resumability, and more
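A bare-bones sketch of that resumable idea (single partial file on local disk, authentication omitted; everything here is illustrative, not how UploadThing or S3 actually does it):

```python
import os
import tempfile

PART_PATH = os.path.join(tempfile.mkdtemp(), "upload.part")

def current_offset(path: str) -> int:
    """How many bytes of this upload the server already holds (the resume point)."""
    return os.path.getsize(path) if os.path.exists(path) else 0

def append_chunks(path: str, chunks, start: int) -> int:
    """Append chunks to the partial file, refusing mismatched offsets so a
    retried or duplicated request can't corrupt the file."""
    if start != current_offset(path):
        raise ValueError(f"resume from offset {current_offset(path)}")
    with open(path, "ab") as f:
        for chunk in chunks:
            f.write(chunk)
    return current_offset(path)

# First attempt sends 10 bytes, then the connection "drops"...
append_chunks(PART_PATH, [b"a" * 10], start=0)
resume_at = current_offset(PART_PATH)  # client asks where to resume
total = append_chunks(PART_PATH, [b"b" * 5], start=resume_at)
```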
Thanks @dsherc, this is insane.
Just a side note: this could be a bit misleading:
I see that you mentioned we can't check the size of files being uploaded.
But even if we can't dynamically check file size while uploading, we can limit the max file size by adding a size cap to the pre-signed POST.
Essentially this is what I did: on upload requests we ask for the file size, and we return the presigned URL with the file size added to the signature as a cap.
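The cap this comment describes is S3's `content-length-range` POST-policy condition (also mentioned in the pinned thread). A simplified pure-Python sketch of signing such a policy; a real policy additionally needs `x-amz-credential`, `x-amz-algorithm`, and `x-amz-date` fields, this just shows where the size cap lives:

```python
import base64
import datetime
import hashlib
import hmac
import json

def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def signed_post_policy(secret_key: str, region: str, bucket: str, key: str,
                       max_bytes: int, now: datetime.datetime) -> dict:
    """Sign an S3 POST policy whose content-length-range condition makes S3
    itself reject uploads larger than max_bytes (simplified sketch)."""
    policy = {
        "expiration": (now + datetime.timedelta(minutes=15)).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},
            {"key": key},
            ["content-length-range", 0, max_bytes],  # the size cap, enforced by S3
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode()).decode()
    # standard SigV4 signing-key derivation: date -> region -> service -> terminator
    k = _hmac(("AWS4" + secret_key).encode(), now.strftime("%Y%m%d"))
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, policy_b64.encode(), hashlib.sha256).hexdigest()
    return {"policy": policy_b64, "signature": signature}
```

The browser posts the file along with the base64 policy and signature; because the cap is inside the signed policy, the client can't tamper with it.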
I know this will sound smug and I am sorry, but: 100% faster should be 0 seconds. So the 377% faster and 509% faster mentioned at 3:10 make no sense. What do those numbers mean? How did you calculate them?
I believe he meant something like:
100% = two times as performant -> final time x/2
377% = three dot 77 times as performant -> final time x/3.77
If it took 5 seconds and now it takes me 1 second I would say my thingy is doing 500% better, because I can do one thingy five times in the time the old thingy took to do one
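The disagreement in this thread is just two readings of "X% faster". Under the usual throughput-ratio reading, 100% faster means twice as fast (half the time), so 5s → 1s is 400% faster, not 500%:

```python
def percent_faster(old_seconds: float, new_seconds: float) -> float:
    """'X% faster' as a throughput ratio: 100% faster == twice as fast == half the time."""
    return (old_seconds / new_seconds - 1) * 100

two_x = percent_faster(2, 1)        # twice as fast
five_x = percent_faster(5, 1)       # the 5s -> 1s example above
video = percent_faster(3733, 733)   # the video's 3733 ms -> 733 ms numbers
```

Under this reading the video's 3733 ms → 733 ms is about 409% faster; a figure like 509% likely comes from the other reading, old/new × 100 ("the new one runs at 509% of the old speed").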
Impressive. But wouldn't it be even faster, if we remove some more requests and roll the "Your Server", "UT Ingest Server" and "S3" components into a single thing managing uploaded files? Something that kind of works as a common base for data?
You've truly mastered the art of making things simple (or should I say, too simple) while monetizing the convenience. Well played. 👏
Oh cool, now my third party upload service has access to all the data I store. Neat.
They already had access before, no? It's their S3 bucket
@@danhorus oh right, I got the impression it was clients own authentication and direct upload to S3. I obviously don't understand what this solution provides.
@@dancarter5595 An easier way to upload things? They also add some code to the process so you don't need to do it yourself. I mean it's like using Vercel so you don't have to set your infra.
Yeah, that's the thing here that sort of defeats using it for anything in production that is user-data sensitive. In the EU at least, because the US of course doesn't care about user data. You would be in breach of GDPR: since you are the controller of the data, you cannot share it with 3rd parties without consent.
I think you are ignoring a huge security loophole in your logic. If the browser gets the presigned URL, then they can just use it directly without having to go through your ingest server, thus ending up with ghost files anyways
But now your server handles all traffic, which could be a problem if it doesn't hold up well.
That upgrade sounds like the logical path. Amazing optimization and simplification from the user's perspective!
Fwiw, Lambdas are not the only way to have serveless compute in AWS. ECS Fargate also offers the benefits of serverless (scale to zero, pay for what you use, etc) without the limitations of Lambda.
With the amount of time webdev goes full circle I am surprised we never get dizzy.
Serverless has become a huge pain, I'll definitely not use it for a new project.
It was just another pointless sales pitch...
This is such a ridiculous comment. I can provide dozens of real examples where serverless has transformed team productivity.
@@kevboutin everything is ok in the right context. Personally I tend to shy away from anything that names itself something that it clearly isn't. There are always servers...
@@m12652 so the name of something is your problem? The name is not a problem for me if it solves problems and increases productivity for less money. Priorities always vary I suppose. 🤷♀
Hey @t3dotgg, curious to know if/how you have mitigated slowloris DoS attacks with the new architecture?
So the upload/forward from the UT ingest server to the "S3" is now not validated. Which means if the connection between those two fails for some reason at any point, you get invalid results. That is a huge cost.
In theory even if you had validation, the ingest server would need to store the files until the actual upload/forward completes. Yes, even if you practically pipe the upload directly between two sockets.
Additionally you keep connections alive (from the file upload to the ingest server) while waiting for the response of the external server. That's not good. If these servers take longer than expected to respond, your ingest server may stack a bunch of inactive sockets which it keeps open for no other reason than waiting. You essentially now have an external bottleneck for your hosted server, costing you resources.
Also, as you said yourself, the difference is much bigger with smaller files, which just means that your per-request overhead got reduced; of course fewer requests mean less added latency. The percentages are kind of misleading. You would actually need a graph that shows the difference depending on file size.
I love it. I had implemented the same structure you had in the past and I was planning on creating an ingest to propagate, super similar to your architecture. That's a great validation of the concept.
I would love to use your project but I run all in GRPC to traffic the data.
It's pretty cool you naturally use a sequential diagram to explain it without even thinking about it or at least mentioning it.
The bring-your-own-bucket is really important. We have contracts at work that specify we have to store customer data in Australia, so if we can't control where it's stored, we can't use the service.
"just forward the packet bro, don't process it" *makes app 5x faster*
Well, that is nothing surprising; everyone should know that every serverless or cloud computation application has an overhead. It is like saying the newly built file upload in Rust is 10x faster than in JavaScript lol
I'd love to see Theo work on some Remix projects. Remix offers a great deal of built-in type safety, eliminating the need for extra implementation effort.
Did you account for filesystem caching before/after in your demo? I assume so, given the breadth of architecture changes you described. But caching recently used files in RAM (as modern operating systems tend to do) can make a very noticeable difference in responsiveness. Especially if the files come from spinning disks, network storage, RAID with parity, SATA SSDs... Pretty much anything but NVME.
Any kind of A/B performance testing where caching is a possibility requires either pre-caching all inputs (run it several times until the numbers look stable) or somehow guaranteeing that the inputs will never be cached. I'm sure you already knew that. But people often forget that it applies to their own demos.
If theo makes this fully free (100% self hosted for everyone) I will be very happy
It would no longer be a service tho
But could offer premium capabilities for companies
Who is this for? Why am I able to just do uploads to s3 in all my apps with aws apis/sdks without 3rd party packages to help, let alone a 3rd party saas service? Honest question. I just don't get why this exists or would be popular beyond maybe a brand new dev following a tutorial where s3 is just out of scope... I'm either too dumb to see where the value is or too smart to depend on a saas to do what the aws sdks do for free.
> Says "Honest question" then immediately shits on the people using it by calling them "brand new devs"
Assuming this is actually honest, maybe check out my other videos about UploadThing and S3? tl;dr - if you think S3 is easy to set up, your implementation is FULL of security issues and probably offers a bad user experience too
Most real companies with object storage have built their own UploadThing-like solution; ours is a generic one that anyone can use at any scale :)
Even these pitches at the end "now you can bring your own bucket!" and "now you can run our server directly in your infra!" seem baffling to me. We already have our own buckets and our own infrastructure simply by using s3 directly. How are those selling points of introducing a saas between us and s3? Again, honest question. I have never felt more out of touch, and can't tell if thats a good or a bad thing hahaha.
You mean to say, removing a thing which causes you thing to be slow makes your thing go fast? 🤯🤯🤯🤯🤯🤯
This is just meant to be educational, to show what things can be slow and how to resolve them for inexperienced developers who haven't reached or considered these steps on their journey.
@@sanjaux it's a joke. Jesus karen
@@macchiato_1881 That was my 2nd guess but I was seeing a lot of comments in bad faith so I really couldn't tell without any tone indicators lol. In a way my reply speaks to them too
@@sanjaux why do you need tone indicators? People like you need to handle negative comments better. I get not all criticism is good. But are you just going to whine at every valid negative criticism or joke you get?
@@macchiato_1881 Well the actual jokes no I'd ignore those, but criticism is best resolved through talking it out. Since this isn't criticism, more signs would have helped differentiate your joke from something actually worth discussing. Handle them better? I'm just trying to understand the thought process behind some comments (the serious ones)
BYOB: Bring Your Own Bucket
Time to remove all the sleeps in the code
Kudos Theo! And thank you for driving us away from serverless!
Great success! It's also quite cute that, even after so many live-streams and videos that you have done, you end up sounding a bit like a school kid presenting their project for the first time in front of the class when you are talking about something that you are really proud of.
Are bandwidth costs negligible now? If not this seems much more expensive for UT to scale.
this is similar to investment, companies need to say something different, since most of them aren't innovative, instead they just go back and forth between things we have done in the past so people will invest in them.
Up next: I went back to client side react
14:50 No offense, but it's weird to round one up and the other down when both end in 733 ms.
3733 ms -> almost 4 seconds
733 ms -> almost half a second
If the amount is greater than 1, it's natural to round to a whole number :)
Yeah, the double-standard rounding is a bit cringe. But you can tell he's really happy, and with those numbers, I'd be happy too.
This just in. Serverless proven to be a buzzword to keep you purchasing overpriced subscription model technology. In other news, paint is wet when applied.
This statement is probably coming from someone who has never built any applications professionally using serverless solutions. It's a paradigm shift and one many people haven't wrapped their heads around yet. People fear what they do not understand and despise things that require LOTS of real world work to become proficient in.
C'mon bro, next you are gonna try to tell me water is wet or something?
Huge improvements! It's great that you feel ready to tackle enterprises, but I can assure you - it's not easy, not at all. Data privacy standards are more looked at than ever, so I'd first go for SOC, HIPAA and EU variants of those to have certificates you can shield yourself against quick-shot enterprise questions :)
As someone who works with AWS for 3rd party security reviews, those enterprise features sound nice. Still, there’s a LOT of config settings that AWS requires (that are not always cheap and is constantly changing) to be meet the bar.
Still, this is very cool infra design change and breakdown. I really appreciate this, folks who don’t work with AWS/cloud don’t understand.
I remember serverless being designed for short-lived, lightweight, infrequent requests for particular functionalities of your application. Hence the server doesn't need to run all the time, which saves you cost, and you don't need to maintain the server.
Lately, it has been abused massively for all kinds of heavy tasks which should belong on your own server. And then people complain about serverless.
The comment section is full of "devs" who say serverless is bad or host your own server is bad. Joke about the web dev, without understanding of those subtle details.
The current generation has huge skill issues imo.
@@doc8527 Yeah, I get to witness some real nutball spaghetti lambda design. If you need to manage 50+ lambdas for your backend, plus have one for every single API, troubleshooting & DevOps becomes a nightmare. Gotta watch every lambda metric, have so many CloudWatch logs, etc. That's where Vercel-like companies do serverless a little better; they're taking on more of that burden, but it's priiicy! I'd pay for it in a heartbeat to save me time though.
Docker containers are where its at. Fargate/ECS that thing. Even EC2 management has improved a lot with CDK + SSM scripts.
Aka. We started using servers, the results are insane!
You improved your product by eliminating network hops, as you should do. But the main component (S3) is still serverless.
How do you host your ingestion server? Are you running your own k8s cluster?
hang on, are you basically getting the files from the clients now? Will you have the same bandwidth as the direct S3? Will you pay for the ingress traffic?
I think theo should ditch this project.
A lot of Theo fans can gather their pitchforks at me, but just read the other comments (FTP exists, you can do this directly in S3, the point of uploadthing was that the data wasn't passing through their server, uppy exists)
yeah, don't use uploadthing
I want to do audio file uploads through an API endpoint to S3. Should I keep SQS in front of Lambda with the maximum timeout, or make a microservice?
When we trigger S3 uploads/copies through various means, rather than having our API state update the front end we allow our client to hit a headObject presigned url to assert that the object has successfully landed. Requires some ugly polling but it’s cheap polling
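A minimal sketch of that polling pattern; the `head` callable stands in for hitting the presigned headObject URL (returning True on a 200, False on a 404), and all names are illustrative:

```python
import time

def wait_for_object(head, attempts: int = 10, delay_s: float = 0.5) -> bool:
    """Poll until the object lands in the bucket. `head` is any zero-arg
    callable that returns True once the object exists, e.g. a GET against
    a presigned headObject URL."""
    for _ in range(attempts):
        if head():
            return True
        time.sleep(delay_s)
    return False
```

As the commenter says, it's ugly but cheap: HEAD requests carry no body, so even aggressive polling costs next to nothing.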
Nah, I think I will stick with Rails Active Storage
Yes, frontend life is better when I can literally write everything in a single HTML file. Vanilla JavaScript + sometimes Web Components for the win. I'm a Helix IDE user so keep that in mind since that may also influence why I don't mind making web apps in that way.
Right after the Vercel sponsorship ended we get this..?
I'm doing a beginner's web dev course that has a file storage project. I ran into the latency issue with this architecture on day one. Originally I tried:
1. Client sends upload request to my server.
2. Server requests signed URL from Supabase.
3. Supabase responds with URL.
4. Server sends URL to client.
5. Client uploads and notifies server when it's done.
6. Server updates db and sends success response.
I can't center a div but I could tell this was horrifically slow! I noticed immediately and switched to streaming through my server to Supabase which was 2-3x faster for small files.
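The gap the commenter noticed falls out of a crude latency model; the 80 ms round trip and 50 Mbps figures below are made up for illustration, and the model ignores that a streaming proxy still has its own server-to-storage hop (which overlaps with the client upload):

```python
def upload_time_ms(rtt_ms: float, round_trips: int, size_mb: float, mbps: float) -> float:
    """Crude model: fixed request/response round trips plus raw transfer time."""
    transfer_ms = size_mb * 8 / mbps * 1000  # MB -> megabits -> ms at the given rate
    return round_trips * rtt_ms + transfer_ms

# presigned-URL dance: client->server, server->storage, client->storage, done-notify
presigned = upload_time_ms(rtt_ms=80, round_trips=4, size_mb=3.2, mbps=50)
# streaming proxy: one request straight through the server
streamed = upload_time_ms(rtt_ms=80, round_trips=1, size_mb=3.2, mbps=50)
```

For small files the transfer time is tiny, so the fixed round-trip cost dominates, which is exactly why the commenter saw a 2-3x win on small files.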
Serverless was a mistake, you’ve improved everything by just going the traditional route
Why do they call it serverless when there's still a server running it but it just has a lot of bloat as well?
@@theairaccumulator7144 haha they should be called pay per use bloat servers
Marketing @@theairaccumulator7144
@@theairaccumulator7144 It's marketing. Just like the cloud and hypervisor. It sounds more cool than using someone else's server or dividing your computer's resources.
@@theairaccumulator7144 Because the developer doesn't need to worry about infrastructure (patching, load balancing, scaling, monitoring solutions). Imagine your services grow exponentially and your bare metal servers or virtual compute costs shoot through the roof because you cannot stay ahead of the usage spikes. If you know what you are doing with serverless, you will easily solve all of these problems/challenges.
Really cool stuff! Now that the infra is more flexible, something I'd love to see in UploadThing in the future is Cloudinary-like image transformations. UT would become a viable Cloudinary competitor with that! Could also be part of PicThing if you plan on doing more with it than just background removal :)
You should take a look at how PicThing is handling images ;)
What would the pricing structure be like for BYOB?
Now it's faster, but it also costs more money: you need to run a server, you need to pay for the bandwidth, and so on. So it's a trade-off; you pay more for your infra, you get a better user experience. It's the same as with auth: you can use a 3rd-party auth system, which saves you a ton of work, but you can't control the user experience down to the very details.
Infra matters more than what frontend/client could ever achieve.
Because on frontend you can only show the loader nothing else because client has limited internet bandwidth.
But what did you use to build your ingest server?!?! typescript? .NET? Go? Rust? Something else??? I wanna know the details about your serverFULL architecture!!! There's no details in your blog post either about what you used to build your ingest server in, how it's hosted, etc. I'm extremely interested in what you landed on for those tech choices.
I wonder how pricing would work with "bring your own bucket". But we're very excited for it since our organisation has rules on what geolocation a bucket can exist in. And even just using local infrastructure.
Just make your own infra/software for this. Waste of money spending it on upload thing
@@Itsneil17 you know that this is like saying "just make your own WordPress"? I guess Upload Thing is simpler, but getting it right is really hard. That's why we use abstractions that hide the real complexity
@@Itsneil17 making our own infra/software also costs money.
@@RedPsyched I've made my own infra for stuff like this. Yes it wasn't cheap at the start but now it costs less than using 3rd party
@@Qrzychu92 yes, in fact, what a discovery that not everyone uses WP. People use frameworks, not a drag-and-drop editor, for building websites.
It's kind of sad that resumable file transfer is a big feature now, because I remember it being a standard thing when I was a kid. It was lost somewhere along the way, and I'm glad to see someone is paying attention.
S3 doesn't support resuming!? Jesus Christ. This is exactly what I mean.
Dropping serverless improved performance? Surprised pikachu. /s
I see the upload to the bucket from the client browser goes through the ingest server and is forwarded to the bucket hosting server.
here is an idea for custom file scanning/checking:
can there be a future where a website can host their own "approval server" that receives a connection from the ingest server,
and "listens in" on the file as it is being uploaded to the bucket server and gives a go/no back to the ingest server?
it doesn't seem like it slows down the upload (as it is being scanned as it is uploaded), takes barely any time to get the green light,
and if it gets rejected the ingest server just tells the bucket server to discard the upload and returns an error to the client browser.
with how fast "just forward the packet" seems to be, it is mostly up to the approval server to respond quick enough.
headers are always at the start and are the most-checked thing to scan on,
so by the time the file is uploaded, the headers have been processed and a green light has been given to ingest.
Just an idea. let me know what you think.
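A toy version of that "listen in while forwarding" idea, with the approval check reduced to a magic-byte sniff on the first chunk (the actual proposal would call out to a separate approval server; the signatures and names here are purely illustrative):

```python
# Example file signatures (magic bytes) for PNG and JPEG.
MAGIC = {b"\x89PNG\r\n\x1a\n": "png", b"\xff\xd8\xff": "jpeg"}

def forward_with_header_check(chunks, allowed=frozenset({"png", "jpeg"})):
    """Forward chunks onward unchanged, but sniff the first one and abort
    the stream if the file type isn't approved -- so scanning happens
    during the upload instead of after it."""
    it = iter(chunks)
    first = next(it)
    kind = next((name for magic, name in MAGIC.items() if first.startswith(magic)), None)
    if kind not in allowed:
        raise ValueError("rejected by approval check")
    yield first      # approved: pass the first chunk through...
    yield from it    # ...and the rest untouched
```

One caveat the comment glosses over: magic bytes only identify the container, so a real approval server that needs to scan content (not just type) still has to see the whole stream before it can give a final verdict.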
Serverless as a concept is really cool, and I hope that the cost-to-performance ratio gets better.
I thought it was gonna be like "not serverless anymore but ... edge"
bring your own bucket + file filtering seems interesting
1.3 seconds to upload 5 MB doesn't sound quick; maybe I'm missing something, but in 2024 this is an awfully slow result