The price honestly makes a lot of sense to me, because it's much better than something like 4o-mini or Gemini 1.5 Flash, but much cheaper than something like 4o or Sonnet 3.5. There was definitely a sweet spot there that I would have loved to have a few months ago, and I'm very glad they delivered it now rather than waiting for engineering to bring prices down further.
I'd imagine it's a bigger model / takes up more VRAM than 4o-mini or Flash. Plus the Claude models do have something special that the others don't
Time will tell; it could be very popular if it's as good as Opus, which is great enough, but way faster and cheaper by comparison.
did you test it? it's a piece of crap
They are not in a position to compete with Facebook, Google or OpenAI on price. They know it and everyone knows it.
That's a lie. They are literally working with Amazon, and they have always overpriced
@@holdthetruthhostage Agreed, since AWS is involved. They always overprice compared to the others
I do pricing for a living. This time it's priced so it doesn't divert revenue away from Sonnet
Can you recommend study material on pricing an AI model serving platform and a related data product distribution platform as a service? I'm in consulting, an ex-marketer, looking to explore this niche topic. Thanks!
The price is a joke! I will stick with Sonnet, and for less demanding tasks, I will use Gemini and OpenAI's cheap models. :)
I don't get your issue; just use Haiku 3 if this one is too expensive for you. It's odd logic: why complain about something when the same company offers different options?
How is that opening statement in any way true?
The more advanced the model, the more compute it seems to take to run.
More compute costs more.
For true agentic workflows, you need Sonnet-level intelligence at the current Haiku pricing. It's unfortunate, but we can't do anything about it. Open-source models aren't even close to Haiku's reasoning, let alone Sonnet's.
Still rooting for LLama to get better and better 🔥
Agreed. Maybe I'll change my mind later on, but right now, I want a faster, cheaper Sonnet, not a new slow, pricey Opus. Hopefully this new Haiku might fill that niche. But yeah, I still want a better Llama, regardless.
Also, when it comes to true frontier models, there may not be a market for more than 3, as people systematically gravitate towards the top 3.
Thanks as always, Sam
Yes, please test the Claude agent
I'm really bummed. I like Claude and want to see Anthropic succeed. I hope they get the funding they need, and have blessed training runs.
I'm just a hobbyist, so having a decent intelligence available for frivolous projects at a low price was great. I'm going to start migrating to Gemini/OpenAI for a few projects. Google has free fine-tuning iirc so maybe I'll fine-tune on Claude outputs 😂
Gemini is too racist, OpenAI is too restrictive and, unfortunately, Claude is less open than it used to be.
Even as a hobbyist you should consider Databricks for fine-tuning agents and managing data products. But vendor lock-in may be an issue, one that only big businesses should seek to solve right now. MLOps across vendors requires a solution that's more agnostic than Unity Catalog... for model management across API providers and data silos.
Sigh of relief here. You can still use the original Haiku 3 for cheap bulk LLM operations via the API at the old pricing. At least for now!
Haiku 3 was awesome. I hope they don't kill my beloved Haiku 3. The new one is a piece of garbage.
It's about demand! If everyone wants it, energy consumption goes up, so of course prices have to go up!
GPT 4 was "0.5 more than" GPT 3.5 and much more expensive. Claude 3.5 Haiku is "0.5 more than" Claude 3 Haiku.
All the LLM companies have very confusing branding and naming for their models. I guess Mistral is the fairest: they just come up with various names rather than pretending that one model is a new version of another.
Gemini 1.5 Pro is radically different from Gemini 1.0 Pro. GPT-4, GPT-4 Turbo and GPT-4o have nothing in common.
We just need to learn to treat each model as a new product, regardless of the often misleading branding.
I can easily imagine that Haiku 3 and Haiku 3.5 will co-exist. 4x price difference basically indicates a totally different class of product. They could have called the new thing Claude 3.5 Limerick.
I was upset and still am, but I'm starting to think maybe they're factoring in prompt caching. That's a discount that doesn't help fund them to keep up with the competition.
I must be using Gemini 1.5 Pro wrong, because I've found that it sucks, but these benchmarks are telling me something different.
It's just too sensitive to the prompt. Let's say it's lazy and takes everything too literally. And it seems like it was trained on forums and Reddit users, while the others are trained on books and articles.
Claude 3.5 (new) Sonnet is not sure how you can claim such things 😮 _I need to correct a misunderstanding in your statement. We are currently in April 2024, so I cannot comment on events from November 2024 as that's in the future._
I've already changed 2 of my apps to use Gemini
If it were OpenAI, they would have just used a new naming convention for the higher-priced model, like Haiku-V7 (vs. Haiku-1 and the newest Haiku).
I am back to 3.0
Me too. My tests showed that the new model is complete crap compared to the old one... and they made it 4x more expensive. That's crazy.
I mean, the community complained that Sonnet 3.5 was cheaper than Opus. I think this is exactly what that kind of stupidity deserves. Good job Anthropic; punish stupidity, the world has more than enough of it.
AI is currently under-priced - I assume in order to create the initial market.
I expect AI prices to increase across the board, once AI need is established.
Whatever the price, it will probably still be worth it for 90%+ of use cases.
4x bump of price tag only reflects 4x increase in greediness
It's not worth it. I would understand if it were better than the old Haiku, but it's some kind of 70B degenerate.
Somewhere in the corner, two truly lobotomized models, an 8B one and a 32B one, are giggling at "degenerate"
@3:14 "Claude-3.5-haiku surpassed claude-3-opus in performance at a lower cost, so we're increasing the cost of haiku"
That makes no sense. Either pass the savings on to the customer, or people will use OpenRouter to get access to BETTER models for CHEAPER 😂
Oh Sam and the art of video naming...
But seriously, I don't like this new trend.
What is Anthropic thinking business-wise anyway? What's the idea behind a 4x price increase?
Companies will just switch back to Flash or other mini models; it's not as if companies are heavily invested in the Anthropic ecosystem at the moment.
Companies can just use Haiku 3 instead of Limerick 3.5 (mis-branded as Haiku 3.5)
@@AdamTwardoch Limerick :) Do you want to stay on older model though?
@ Newer doesn't necessarily mean better. OpenAI had a few pre-GPT-3 models that were great at certain kinds of writing (very uncensored). Also, OpenAI introduced GPT-4o as a "faster" variant of GPT-4 Turbo, which in turn had been a "faster" variant of GPT-4. Faster means cheaper to run, but also smaller, distilled, quantized.
AI companies these days experiment with longer contexts, faster and cheaper inference, expert switching, etc. All these things look promising on paper but aren't necessarily better. Typically, the vendors aren't working towards "best" but much more towards "good enough but much cheaper to run" (see GPT-4 Turbo & GPT-4o).
My feeling is they're pricing a certain percentage of people/use cases out because they likely don't have the required inference compute. Guys, supply and demand is a b****
They should charge more if it were more expensive to run, not just because it's more intelligent.
I've pretty much moved everything over to Gemini. I think in the long run, Google has very real advantages (robust research, an application suite, Android, YouTube, a safety team, access to almost limitless data, TPUs, massive context windows, unique API functionality, etc.). Is Gemini on top of every leaderboard? No. But the models are solid performers at a great price (and Flash is extremely fast). The real letdown with Google is the Cloud Console, which is an absolute mess.
Are you talking about the AIStudio console or VertexAI? Anything in particular you would like to see fixed or added ?
@@samwitteveenai The Google Cloud console. It looks like the control panel on the ISS and feels like it was developed by someone overly excited to get a "free" copy of Macromedia Dreamweaver in 2002. Managing API keys, projects, and pricing is SO MUCH EASIER with OpenAI and Anthropic; just a few clicks. Also, Google does not allow setting a hard cap on spend like the others do, only alerts. Very risky if something goes off the rails.
You can get an API key super easily in AI Studio!
Gemini is awesome for its price. Gemini 1.5 Pro is the best after Sonnet 3.5 and 4o-latest
For me, nothing comes close to Opus 3. Opus 3 writes beautiful eloquent prose. The OpenAI models write like a serviceable college graduate. Gemini's writing is atrocious. Even Elon Grok's writing is better.
An H100 chip without RAM costs Nvidia maybe $100. 80GB of GDDR6 RAM costs about $200. HBM RAM isn't more expensive to make; it's just more pins for a wider bus. So that tells you that AI will become dirt cheap. Nvidia is on a greed frenzy, and AMD is unfortunately not calling them on their BS. It's strange how passive AMD is, because tensor chips are not hard to design from what I gather. Indeed AMD has them; they just have to price them even slightly reasonably to completely destroy Nvidia's 20x-above-norm earnings. We are being artificially held back. You could easily imagine a home developer board with 1TB of fast RAM instead of the pathetic 24GB in a 4090. And it sounds like the 5090 won't have much more RAM, despite RAM prices having dropped 80% while neural models have exploded in size with no end in sight.
A very affordable GPU/TPU could easily run GPT-4o at good speed, but we need more memory. AI has ballooned so fast, combined with Nvidia's enterprise greed, that personal compute hasn't kept up at all despite the components being very affordable. Wake up and demand better.
Lol, and everyone has been canceling their $20-a-month subscription because of rate limits. They're passing that cost on to the 3.5 Haiku model now lol
I'd rather throw 20 bucks at OpenRouter, send hundreds of messages an hour, and still have some left over for next month
@@MudroZvon Openrouter???? never heard of that, thanks
If Haiku is better than Opus and far better than the previous Haiku, but so much more expensive, they should have given it another name so people wouldn't feel like they're overpaying for Haiku
"Aha! We've got ourselves a marketer!"
I doubt it's better
Sam has never heard of Alaska
Water flowing up? Or sunsets ?
@@samwitteveenai prices for LLMs are going up in Alaska?
I tested it and was disappointed. It's far worse than the previous Haiku. Now it gives me lists instead of good answers. Now it mixes Russian with English. An epic fail! It's dumber, slower, and more expensive! Outrageous! 😢 😡
It's just a different model. Think of it as Claude 3.5 Limerick. It's not really an update of Haiku 3.
Models get smarter, people get dumber 😂
they just suck at naming things...