I've watched a few Windsurf videos and it looks great. Thanks for exposing some of its weaknesses. I'm still planning to dive into it. In spite of its imperfections it looks pretty amazing.
100%! A few years ago, it was hard to imagine something like this being possible. And now this is more or less a reality.
Regarding its imperfections, I think this is more about the nature of how LLMs work than about Windsurf itself. Since they can only predict completions, there’s no real critical thinking possible.
But I’m curious to see how these issues will be addressed in the future. If it were up to me, I’d probably try to add a couple of additional AI agents to the ecosystem whose sole job would be to critique and challenge everything they see before the final output is presented. But let’s see what the future brings!
Thanks for testing it. I tried it myself with a limited Internet connection (unfortunately) and managed to build a partial 'Windsurf 2' - an AI IDE that used a local AI/LLM and worked similarly to Windsurf, in some ways even better - and it sort of worked. But as the project got bigger, Windsurf struggled more and more to get anything right; there were a lot of dependency and context issues. If I had a proper Internet connection, I think I might have a 'Windsurf' 2 or 3 by now. The project is on hold until I get a proper Internet connection.
Thanks for the comment - it sounds like an exciting project! I’d love to hear if you manage to see it through to the end. There is another video on the channel about Windsurf, where we also tested Windsurf’s ability to build an entire app (just a Tetris game, but still). The results were a bit questionable for me - everything looked professional and polished, but only until the testing began. So I’d really appreciate it if you could share the outcome of your experiment. Windsurf 3 sounds like a much bigger challenge than a simple Tetris game ;-)
In general, the biggest issue I see right now is the lack of stability. AI doesn’t have clear boundaries and can easily decide to rewrite something unrelated to the task you gave it. As a result, you often have to sit down and carefully review all the changes, especially when it touches many files. This becomes particularly tricky with large projects that have an established structure that should stay untouched.
But then how do you run LLMs locally that are powerful enough to handle the data needed to complete your work? 4 x 4090s? 😅
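(On the practical side of that question: once a local model server is running, wiring it into a tool is the easy part - the hardware is the real constraint. Purely as an illustration, here is a minimal Java sketch assuming an Ollama-style server on its default local port (11434) with a model such as llama3 already pulled; the model name and prompt are placeholders, not anything from the project described above.)

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class LocalLlmSketch {
        public static void main(String[] args) throws Exception {
            // Assumption: an Ollama-style server is listening on its default port (11434)
            // and a model named "llama3" has already been pulled locally.
            String body = """
                    {"model": "llama3",
                     "prompt": "Summarise what this Java method does: public int add(int a, int b) { return a + b; }",
                     "stream": false}
                    """;

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:11434/api/generate"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            // Send the prompt and print the raw JSON reply; the completion is in the "response" field.
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }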
One trick to keep it from rearranging all my code is to let it write documentation and use that as context for new requests. You get nice documentation and less unnecessary rewriting of existing code. I think Windsurf is a very nice IDE with a lot of potential.
Interesting idea - I’d love to try that next time! How exactly do you use documentation as context? Do you tell it explicitly to consider all the generated docs?
Otherwise, what’s the difference, for the AI, between the code itself and the Javadocs it previously generated? Even in this video, there was a moment where some Javadocs were rephrased yet again. Curious to hear your thoughts!
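(To make the documentation-as-context idea a bit more concrete, here is a tiny, made-up Java example; the class and its rules are invented purely for illustration. Once the tool has generated Javadoc like this, a follow-up request can point at the documented contract - e.g. "keep the behaviour described in the Javadoc of OrderValidator, only add a discount check" - rather than letting the agent re-derive the intent from raw code and rewrite it along the way.)

    /**
     * Validates incoming orders before they are persisted.
     *
     * <p>Documented contract (the part a later prompt can refer back to):
     * orders with a non-positive total are rejected, and existing rules must
     * not change - new behaviour is added as separate methods.
     */
    public final class OrderValidator {

        /**
         * @param totalInCents the order total in cents; must be greater than zero
         * @return {@code true} if the order may be persisted
         */
        public boolean isValid(long totalInCents) {
            return totalInCents > 0;
        }
    }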
Sometimes we think we know how to prompt. But trust me, I used ChatGPT to articulate for me and got much, much better results with Windsurf and Cascade. It's all about the prompt.
I’m not sure ChatGPT and Windsurf can be directly compared. Windsurf isn’t just a "simple" LLM - it likely has optimized prompts, and it uses a lot of data as context when generating responses (at the very least, your entire project is available to Cascade). It also has tools to fetch and send even more data to the LLM. With ChatGPT, creating the right prompt is entirely up to the user, which is usually much more limiting.
@GrabDuck To articulate for me, for Windsurf. I put things in my own words in ChatGPT, then get ChatGPT to write the prompt for Windsurf. That's how I get my results from Windsurf. If ChatGPT were better, I'd just use that and wouldn't even be on this page - I was only trying to help. Using a chat model to help with your prompts gets better results. Play them off against each other, or just use each for what it was designed for. But if you pay, like I do, for ChatGPT and Codeium's Windsurf, you'll see most roads lead back to Claude or OpenAI.
I just use it to create the file structure and formulate my code. One thing I have noticed: if it's a large file system and you ask for a full audit, it will never go through every file and folder, no matter how you prompt it - 10 files max. So you also have to keep an eye on that. If you notice things are not being created, or whatever, I just switch between models. It's as if there is a slight underlying fair-usage policy and they drop you to a lesser model. So be aware of that and switch models if it starts performing badly. The write and chat toggle is very important too.
@GrabDuck I said to articulate for me - not to use it instead of, or as better than, Windsurf. I pay for both and usually copy and paste from one to the other, because I find that when they speak the same language, things get done better. I get too carried away in chats and don't get to the AI tech side. So I use ChatGPT (which was built to learn and chat with me) for the chats and Windsurf for the coding.
@GrabDuck I love Windsurf. I've tried Bolt.new, v0 and Cursor, and for me nothing compares. The VS Code integration, the file editing and creation, running commands - you can't go wrong.
@PyJu80 Thank you for sharing your insights - it’s always great to hear different approaches! I also use the paid version of ChatGPT (which is, ultimately, still OpenAI), but I’ve never tried combining two models and making them “talk” to each other. My approach has always been to make clarifying adjustments whenever it turns out the LLM “misunderstood” me. Although, technically, your approach is somewhat similar to mine: with your first prompt to ChatGPT, you get an improved version of your own query, which gives Windsurf a clearer, more precise direction to work with, resulting in better outcomes.
By the way, how do the chat and write modes differ? I’ve tried both but didn’t notice any significant difference.
It is messing things up, same as Cursor.
The AI is not perfect. To limit the scope of the files to be looked at, specify which files you are dealing with using the @ symbol in your prompts. From what I observed, the prompts were not always focused, and insufficient context was given, which is required for bigger projects. It is a tool, and perhaps we need to learn to use it more effectively - perhaps Codeium should provide some tutorials.
Thanks for the comment! Really great tips tbh! From my experience, it all comes down to how clear and technically correct the prompt is, including setting limits on what NOT to do. Btw, limiting file scope like you said is a good way to keep things focused. One idea I had - what if we start documenting common issues with AI in a separate file (like project-specific or even for the whole team/company) and just include that file as context in every prompt? Could be interesting to try.
What we showed in the video isn’t exactly how I’d use the tool myself. I see it more like a mix - sometimes autocomplete, sometimes manual coding, and AI for tasks that are repetitive but still clear.
But for the vid, the idea was to follow a specific scenario - like a senior supervising a junior buddy, talking to the AI like it’s a real person (so no detailed prompts) - just to see how well these agents can handle work on their own with minimal oversight.
And yeah, I totally agree, Codeium should def have some tutorials. I even checked their site for this, but found nada. 😅