Awesome! I’m excited to try it out. Thank you for all the effort and time on a scraper that will actually SAVE me time. Cheers
Wow. This is a nice project. Well done. I especially liked the Ollama implementation.
Wow super impressive and well structured!
Does the scraper also work with SPA websites? Many pages must first be rendered by JavaScript before the data can be processed.
Does it send the full HTML to the model, or does it first send the DOM/XML tree, then ask for XPaths to get the text, and then ask to format the data? A short video on how it works internally would be helpful, using either a PowerPoint on Canva or a whiteboard.
Sure! I will add it to the list ☺
This is excellent work
How does it fare iterating through multiple sites with SmartScraper and with slightly more complex queries?
This is interesting, thanks for the library
Is it possible to set the schema for our JSON output, similar to what you'd get from Instructor?
A more in-depth example of using an open-source model with Ollama would be useful for the entire community. Use cases: scraping from an external file that has multiple URLs without having to embed them, etc.
can this scrape information from ? thanks!
Cool. So this is a graph!? Is there an example of how to use more complex edges for a scraper?
Will it work on a website with a login section?
Is it just 1 page? Can it go down a line of links from pages?
sorry but yes, I tried
Italians OWO
hello
Darn … your knowledge and videos look potentially excellent but your accent makes it impossible to understand. My grandfather had a thick accent when he immigrated, and I recall him talking about how he practiced speaking English in a way that got rid of his accent… these days you can probably just use AI, but it’s up to you. If you want to grow your channel for English-speaking people, I would highly recommend you make yourself understandable. I hope you take this as constructive criticism because you have amazing potential. Thanks.
I am actually closing the video for the same reason, I just can't hear or understand. I will be opening the GitHub to see about possibly reading the docs. If there was an AI to fix the audio level and change the accent it might be great info, but I don't know.
@@SteveBryant-x6d I think there are many options for using AI to clean up the audio. ElevenLabs comes to mind, and I’m sure there are others. It’s a simple speech-to-speech AI, so the person creating the video can either type in the text or just talk, and it will convert it into more understandable English. Hopefully this makes sense. Thanks.
His English speaking skills are just fine, but the audio quality is terrible. There is a high-pitched tone throughout, which is likely from a bad mic; it’s clear he didn’t bother to send the audio through any processing pipeline. My recommendation would be to use the “studio audio” feature in Descript.
Hey, thanks for the criticism! Actually this video wasn't meant to be published on YouTube and we were asked to do it quickly. For sure there is a lot of room for improvement :)
I'm not even a native English speaker and I understand everything perfectly, so I'd say it's you who needs to improve his English to better understand other accents 😂
And it's very simple to prove: the automatically generated subtitles get most words right, so the accent is just fine. However, I did need to raise the volume to the maximum to hear clearly.