Apify Basics: How To Scrape Anything In Minutes (No Code)

  • Published on 1 Dec 2024

Comments • 47

  • @MauLi2010
    @MauLi2010 16 days ago

    Thank you Jono for your long-form, detailed content! You are my second teacher after Make Academy! Your real-life use cases are super helpful. Cheers, and keep up the great, detailed work! :-)

    • @jonocatliff
      @jonocatliff 12 days ago

      Hahaha, thank you so much, really appreciate it! If you want the Make.com academy summed up in one video, you can find it here: th-cam.com/video/MpmpC4C5fZs/w-d-xo.html

  • @charleyanderson9893
    @charleyanderson9893 1 month ago +2

    For anyone hitting the Function 'if' error when copying the JSON from Apify: when pasting the JSON into the Make module, right-clicking and choosing "paste as plaintext" instead of a normal paste solved it for me.

    • @ssmpm
      @ssmpm 25 days ago

      Good tip. I was using the DevToys app from the Windows store to correct the JSON code copied from Apify.
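
The thread does not say why a plain-text paste helps. One plausible cause, which is an assumption here, is that a rich-text paste swaps straight quotes for typographic ones or inserts non-breaking spaces, neither of which is valid JSON syntax. A minimal Python sketch of that clean-up (the function name and the sample input are hypothetical):

```python
import json

# Characters that rich-text pastes may substitute for plain ASCII ones (assumed cause).
_REPLACEMENTS = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u00a0": " ",  # non-breaking space
}


def to_plain_json_text(pasted: str) -> str:
    """Replace typographic quotes and non-breaking spaces with ASCII equivalents."""
    return pasted.translate(str.maketrans(_REPLACEMENTS))


# Hypothetical input as it might look after a rich-text paste.
rich = "{\u201clanguage\u201d:\u00a0\u201cen\u201d}"
plain = to_plain_json_text(rich)
parsed = json.loads(plain)  # parses once the quotes and spaces are plain ASCII
```

Note that this replaces typographic quotes everywhere, including inside string values, so it is only safe when the values themselves contain no such characters.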

  • @tonyivan508
    @tonyivan508 3 months ago

    Loved your content, bro! Thank you for everything!

    • @jonocatliff
      @jonocatliff 3 months ago

      Thank you very much!

  • @FindMyLocal
    @FindMyLocal 2 months ago +3

    Thank you Jono, this looks amazing, but I always get this error: "The operation failed with an error. Function 'if' finished with error! Function 'validateInputJSON' finished with error! Please check that your input is a valid JSON". I have followed the video but always get the error.

    • @jonocatliff
      @jonocatliff 2 months ago

      Hey! The error you're receiving is saying that the data isn't structured correctly. In Make.com, data is structured in JSON format, and it's very strict - if there's an extra comma, quotation mark or anything else added or missing, it'll throw that error. Even if you run it through ChatGPT and get it to correct your JSON, sometimes it'll still throw that error because Make.com wants your JSON to be structured a certain way.
      Based on the limited information I have, this is really all I can say. But I would put the JSON into a 'Parse JSON' module, run only that module, and go back and forth with ChatGPT, asking it to restructure the JSON until Make.com accepts it.

    • @غيرمعرف-غ6ب
      @غيرمعرف-غ6ب 2 months ago

      me2

    • @jonocatliff
      @jonocatliff 2 months ago

      Did you run the Apify actor synchronously in Make.com? You'll have to check the box.

    • @jonocatliff
      @jonocatliff 2 months ago

      I just tried re-downloading the blueprint in the description and it worked for me. Did you try downloading it for free? jonocatliff.gumroad.com/l/ekunr

    • @jonocatliff
      @jonocatliff 1 month ago

      Hey there! Super crazy, I clicked the post button but my comment was never published. Have you tried downloading the blueprint I provide for free? I just tried it and it works properly.
      The error you're encountering is because the JSON data isn't structured properly. JSON is very sensitive, so these errors come up often.
      Can you tell me the exact timestamp where you're encountering this error so I can look into it? I assume you're running into this issue at module 1 or 3 in the Make.com scenario, correct?
      Here's how I'd solve it:
      1) Download my blueprint, it should work: jonocatliff.gumroad.com/l/ekunr
      2) Reset the Apify actor and try again. When changing the code, make sure to leave in any straight double quotation marks "" and commas ",". Do not introduce other quote characters, such as curly quotes “”, single quotes '', or backticks ``. Make sure there's always a comma after each key-value pair (i.e. "language": "en"), with the exception of the last key-value pair, which cannot have a comma after it.
      3) Ask ChatGPT to restructure the JSON code for you so that it works with Make.com.
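
The comma and quotation rules above can also be checked locally before pasting anything into Make.com. The following is a minimal Python sketch using the standard `json` module; it only checks general JSON syntax, not whatever extra structure Make.com or a particular actor expects, and the `maxItems` key is a made-up example.

```python
import json


def check_json(text: str) -> str:
    """Return 'ok', or a message giving the position of the first syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid at line {err.lineno}, column {err.colno}: {err.msg}"
    return "ok"


# Hypothetical actor inputs: one valid, two breaking the rules described above.
good = '{"language": "en", "maxItems": 10}'
trailing_comma = '{"language": "en", "maxItems": 10,}'  # comma after the last pair
single_quotes = "{'language': 'en'}"                    # wrong quote character

for sample in (good, trailing_comma, single_quotes):
    print(check_json(sample))
```

The line and column in the message point at where the parser gave up, which is usually at or just after the offending character.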

  • @mehmetmanavoglu1175
    @mehmetmanavoglu1175 1 month ago

    Hi Jono, thanks for the video, it is a great introduction! I am building an automation on Make using the Google Maps extractor and the contact details extractor from Apify. However, I struggle with the realistic scenario of more than 200 URLs that have to be scraped for contact details. The Google Maps extraction is working well, but the contact details actor is both timing out and generating separate scrape sessions for each URL, and no matter what I do, I couldn't fix the issue (working together with OpenAI o1-preview). What would you suggest as the best practice in my case? I am feeling very stuck at the moment. Thanks.

    • @jonocatliff
      @jonocatliff 1 month ago

      Hey there, you'll need to separate these into different workflows, instead of being in one. Apify will time out after 120 seconds in Make.com. To overcome this, you'll need to:
      1. Run the actor in the first scenario
      2. Wait for new actor runs in the second scenario
      This will avoid the timing out problem you're running into. Hope this helps :)
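
Outside Make.com, the same idea (start the run, then check on it later instead of holding one request open) can be sketched as a polling loop. This Python sketch is an illustration only: `get_status` is a hypothetical callable you would supply, the status strings are assumptions, and no real Apify or Make.com API is called.

```python
import time
from typing import Callable


def wait_for_run(
    get_status: Callable[[], str],
    done_statuses: frozenset = frozenset({"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}),
    poll_seconds: float = 10.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it reports a finished status, or give up after max_polls."""
    for _ in range(max_polls):
        status = get_status()
        if status in done_statuses:
            return status
        sleep(poll_seconds)
    raise TimeoutError("run did not finish within the polling budget")


# Demo with a fake run that finishes on the third check (no network involved).
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
result = wait_for_run(lambda: next(fake_statuses), poll_seconds=0.0)
print(result)
```

Because each check is a short, separate call, no single step has to stay open longer than the 120-second limit mentioned above.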

  • @vtheron
    @vtheron 2 months ago

    Ran the 2nd actor and got this error: "The operation failed with an error. Input is not valid: Items in input.startUrls at positions [0] do not contain valid URLs". I suspect the error occurred due to an empty or invalid URL. Do you know how I can resolve this?

    • @jonocatliff
      @jonocatliff 2 months ago

      Hey there! You're correct, it's to do with the URL. Usually what I like to do is plug the error into ChatGPT and have it solve it for me. Let me break this down:
      1. [0] in coding refers to index position 0, which is the first item in a list, meaning it's the first URL you passed in
      2. input.startUrls is just the list of URLs that you are required to enter
      There's one of two possibilities here:
      1. You didn't enter a URL, which is unlikely
      2. You didn't format the URL properly, which happens to me as well. Just make sure to include the protocol (https://), the subdomain, and everything else. If in doubt, go to the webpage and copy it from the URL bar.
      Hope this helps!
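
To see which entries would trip a "positions [0]" style error before running the actor, the list can be checked locally. This is a rough Python sketch using the standard library; the actor's own validation rules are not shown in the thread, so the rule used here (an absolute http or https URL with a host) is an assumption.

```python
from urllib.parse import urlparse


def invalid_url_positions(urls):
    """Return the zero-based positions of entries that are not absolute http(s) URLs."""
    bad = []
    for position, url in enumerate(urls):
        parts = urlparse((url or "").strip())
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(position)
    return bad


# Hypothetical input: a URL missing its protocol, a complete URL, and an empty cell.
start_urls = ["example.com/contact", "https://www.example.com/contact", ""]
print(invalid_url_positions(start_urls))  # prints [0, 2]
```

Position 0 is the first entry, so the result lines up with the numbering used in the error message.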

  • @christopher60625
    @christopher60625 26 days ago

    Ran this exact sequence, Google and Content scraper (in Apify) showed over 1,000 results but a max of 60 rows get added in Sheets. Yes, I ran it twice with the same result resulting in drained credits in Apify. No clue what I'm doing wrong.

    • @jonocatliff
      @jonocatliff 21 days ago

      Hey Christopher, it's a bit hard for me to dissect exactly what's going on with the limited information I have, but here are a couple of things you could try:
      1. Put a sleep before the Google Sheets module to avoid 429 rate-limiting errors
      2. Try breaking the scenario into two scenarios, the first running the actor, and the second watching for actor runs and handling everything past that.
      3. Make sure you set up error handling, in case one of the modules fails and stops the scenario.
      4. Make sure the Make.com scenario isn't timing out after running for too long. If this is the case, you'd have to break up the scenario even further using webhooks/HTTP requests
      Hope this helps, best of luck :)
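
Points 1 and 3 above (pausing to avoid 429 errors, and handling failures instead of stopping) have a common equivalent in code: retry with a growing delay. The Python sketch below is a generic illustration, not how Make.com or Google Sheets implement it; `RateLimited` is a hypothetical exception standing in for an HTTP 429 response, and `action` is whatever write you want to protect.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def call_with_backoff(
    action: Callable[[], T],
    retries: int = 5,
    first_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action; on RateLimited, wait and retry, doubling the delay each time."""
    delay = first_delay
    for attempt in range(retries):
        try:
            return action()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries: surface the error instead of hiding it
            sleep(delay)
            delay *= 2
    raise ValueError("retries must be at least 1")


# Demo: a fake row write that is rate limited twice before succeeding.
attempts = {"count": 0}


def add_row() -> str:
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimited()
    return "row added"


print(call_with_backoff(add_row, first_delay=0.0))
```

A fixed sleep before every write is simpler and closer to what the reply suggests; the backoff version only waits when a limit is actually hit.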

  • @kalam1ty
    @kalam1ty 21 days ago

    For some reason Apify is not letting me make an account. It looks like it's going to go through, but then it goes back to the create account page! Any recommendations?

    • @jonocatliff
      @jonocatliff 19 days ago

      Hey there, I would either try clearing my browser cache for Apify.com, or try an incognito tab or another browser (Firefox, Safari, Google Chrome). If that doesn't work, I'd try another computer. Hopefully that helps!

  • @ssmpm
    @ssmpm 25 days ago

    Since Module 3 needs a Website to scrape, I added a filter that checks if Website exists before continuing. Now I need to figure out how to output the results that don't have Websites, or add a Module that will scrape a different site instead, such as Yelp or BBB.

    • @jonocatliff
      @jonocatliff 21 days ago +1

      Hey there, this is a fantastic idea. Personally, I just filter them out, but you can also do this approach too :) Thanks for sharing
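
The filter described in this thread, together with the open question of what to do with results that have no website, amounts to splitting the results into two groups. A small Python sketch, assuming each result is a dictionary with a `website` field (the field names and sample data are made up for illustration):

```python
def split_by_website(places):
    """Separate records with a non-empty 'website' value from those without one."""
    with_site, without_site = [], []
    for place in places:
        website = (place.get("website") or "").strip()
        (with_site if website else without_site).append(place)
    return with_site, without_site


# Hypothetical scrape results.
places = [
    {"title": "Acme Landscaping", "website": "https://acme.example"},
    {"title": "Green Thumb", "website": ""},
    {"title": "No Site Gardening"},
]
to_crawl, needs_fallback = split_by_website(places)
print(len(to_crawl), len(needs_fallback))  # prints 1 2
```

The second list is what would go to a separate output or to a fallback source, rather than being dropped.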

  • @reggychiller7979
    @reggychiller7979 1 month ago

    How can I save the last scraped ID so the scraping process doesn't scrape the same listings again? In Apify there is the stores section where I can get the API. Is the solution to make an API GET call before running the first Apify module?

    • @jonocatliff
      @jonocatliff 1 month ago

      Hey, great question! Out of the box, with most of these scrapers on the store, I don't think you can - if you find a way please let me know. Otherwise, you'd essentially have to work around this limitation.
      1. The real way to make this work would be to either create your own script in Apify or go into the code of the scraper and change it so that it first checks whether the listing has been scraped before and filters it out if it has.
      2. Make sure that if you scrape a particular keyword, e.g. 'landscapers new york', you scrape everything so there's no need to re-scrape the same data. To do this properly, you'd need to break the workflow in Make.com into two parts; otherwise, if you wait for it to complete in one scenario with the 'synchronous' setting, it'll time out after 120 seconds.
      3. If you push these into a Google Sheet, you can check whether the result already exists and filter it out if it does, so you don't have duplicates.
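
The check in options 1 and 3 boils down to remembering which IDs have been stored and skipping them on later runs. A minimal Python sketch, assuming each scraped item carries a unique identifier under a key such as `placeId` (the key name and the sample data are assumptions, not something confirmed in the thread):

```python
def drop_already_scraped(items, seen_ids, key="placeId"):
    """Return items whose key is not in seen_ids, recording the new keys as it goes."""
    fresh = []
    for item in items:
        item_id = item.get(key)
        if item_id is None or item_id in seen_ids:
            continue  # no usable ID, or already stored on an earlier run
        seen_ids.add(item_id)
        fresh.append(item)
    return fresh


# Hypothetical state: IDs already in the sheet, then a new batch with one repeat.
seen = {"abc"}
batch = [{"placeId": "abc"}, {"placeId": "def"}, {"placeId": "def"}, {"name": "no id"}]
new_rows = drop_already_scraped(batch, seen)
print(new_rows)  # prints [{'placeId': 'def'}]
```

In a spreadsheet workflow, `seen_ids` would be loaded from the ID column before each run. This avoids duplicate rows but does not stop the scraper from re-fetching the listing, which is the limitation described above.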

  • @maximodejesus6390
    @maximodejesus6390 2 months ago

    Can these tools scrape local county clerk of court websites for pre foreclosure data?

    • @jonocatliff
      @jonocatliff 2 months ago +1

      Perhaps, I'm unfamiliar with that industry, but you can take a look at the store in Apify and there might be a solution for you in there

  • @isaiahkimiti6520
    @isaiahkimiti6520 2 months ago

    I am getting an error (Function 'if' finished with error! Function 'validateInputJSON' finished with error! Please check that your input is a valid JSON.) in the Run Actor module, even though I have double-checked and confirmed the JSON is okay.

    • @jonocatliff
      @jonocatliff 2 months ago

      Hey Isaiah, thanks for mentioning this. I need a bit more information. Can you tell me exactly where in the workflow this error is occurring, or the timestamp in the video you're encountering this at?

    • @jonocatliff
      @jonocatliff 2 months ago

      I just tried re-downloading the blueprint in the description and it worked for me. Did you try downloading it for free? jonocatliff.gumroad.com/l/ekunr

    • @isaiahkimiti6520
      @isaiahkimiti6520 2 months ago

      @jonocatliff It finally worked. Thank you for the follow-up.

    • @jonocatliff
      @jonocatliff 2 months ago

      Perfect!

  • @CyberpunkSideral
    @CyberpunkSideral 6 days ago

    By God, how do I extract the phone number from a list of sites stored in Google Sheets? I've looked through all the videos that a human being is capable of watching, and your videos are the ones that come closest to explaining it, but I can't find any that teach how to extract phone data from URLs in a spreadsheet and send it to a new spreadsheet or anything else in a way that works. Please make a video about this, it's definitely a question from thousands of other subscribers. Greetings from Brazil!

    • @jonocatliff
      @jonocatliff 6 days ago

      Hey! Hahaha, too funny, you're very close - don't give up! If you have a Google Sheet with website URLs, you'll want to use the "Website Content Crawler" in Apify to scrape the URLs, and if there's a phone number on the site, it'll pull it out by crawling multiple website pages.
      You can import these URLs directly into that scraper. If you want to do it automatically, set up a Make.com scenario with a Google Sheets "Get Range Values" module as the trigger, set to A1:Z999, to pull in all the rows; by default it'll iterate (go through them one by one). Pass each URL into another Make.com module, "Run Actor", with the "Website Content Crawler" actor, then get the results with another Apify module, "Get dataset items", and paste those results back into the Google Sheet.
      Hope this helps :)
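
If the crawled results come back as page text rather than a ready-made phone field, the numbers still have to be picked out of that text. The Python sketch below shows one rough way to do that with a regular expression; the pattern is a simple heuristic of my own, not something the actor provides, and it will miss some formats and can match other long digit sequences.

```python
import re

# Heuristic: an optional "+", then digits mixed with spaces, dots, dashes or brackets.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def find_phone_numbers(text):
    """Return phone-like strings in order of first appearance, without repeats."""
    found, seen_digits = [], set()
    for match in PHONE_PATTERN.findall(text):
        digits = re.sub(r"\D", "", match)  # compare numbers by their digits only
        if 8 <= len(digits) <= 15 and digits not in seen_digits:
            seen_digits.add(digits)
            found.append(match)
    return found


# Hypothetical page text.
page_text = "Call us on +1 (555) 010-9999 or email hi@example.com. Open 9-5."
print(find_phone_numbers(page_text))  # prints ['+1 (555) 010-9999']
```

Each crawled page's text would be passed through `find_phone_numbers`, and the matches written to the sheet next to the source URL.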

  • @ceojenniferlee-oq9kt
    @ceojenniferlee-oq9kt 3 months ago

    How do you add in the name of the business and, if possible, the name of the owner? Thank you for all you do!

    • @jonocatliff
      @jonocatliff 3 months ago

      You have a few options here. If you want to do it manually, I think your best bet is to use either LinkedIn, or Apollo to scrape people at that company to grab their business contact info and email address

  • @OpeyemiDorcas-t7d
    @OpeyemiDorcas-t7d 3 months ago

    What can be done if the file is too long?

    • @jonocatliff
      @jonocatliff 3 months ago

      Hello - thanks for the comment. If I understand correctly, you mean the file as in the initial web scrape, correct?
      You'll have to make one small tweak here. When you run the actor 'synchronously', it automatically times out after 120 seconds - which is not enough time, let's say, to scrape 10,000 leads. What you'll need to do is separate the workflow into two parts.
      1. The first part is just running that actor - just one step.
      2. The second part would have a trigger 'watch for actor runs', and then once the run is complete, it'll automatically fire off, and then you can connect that back into the 'get dataset id' section.
      I explain what this looks like at the beginning of this video: th-cam.com/video/qwsB72PhM3E/w-d-xo.html
      Hope this helps. Let me know if this didn't answer your question!

  • @BobsWebofWonder
    @BobsWebofWonder 4 months ago

    Ha, just found your channel. It is very interesting and informative. I'm Bob, 62 years young, and I live in Australia. What would your advice be to me, a complete novice in this area of business? My problem is that I need to learn the basics, but where to start? Also, what would be a good starter business to learn with? Maybe you could create a video for those of us new to AI and automation. I have one problem with your channel, and that is you need to improve your flow with better editing. For example, if you are struggling to find something in an application, you need to cut that out, as it is confusing. Thank you, Bob.

    • @jonocatliff
      @jonocatliff 4 months ago +3

      Hey Bob! Thank you so much for taking the time to comment :)
      If I was in your position, I would start with Upwork.com, choose a niche and apply to jobs straight through there. You'll need 3 people to review you before you'll start getting real traction, but you could easily start charging $40-50+ per hour. I would also personally just binge watch everything to do with the niche I want to specialize in, and then start undercutting the price to get my first few clients, then progressively charge more as I had more experience to showcase.
      As for the editing, this is a great point. I'll do my best to change this on future videos.

  • @crawman75
    @crawman75 2 months ago

    I copy the JSON in and keep getting an error.

    • @crawman75
      @crawman75 2 months ago

      Got that fixed and I can see it running in Apify, but it's not getting the data. I have the default set.

    • @jonocatliff
      @jonocatliff 2 months ago

      Usually the JSON error is because it's not structured properly. This could be because you might not have selected all the data, or perhaps you tried to change it and deleted a character. JSON unfortunately is very finicky.
      If it's not getting the data, it's likely because the data doesn't exist yet. Make sure to set it as synchronous, so that it waits for the web scrape to finish.

    • @crawman75
      @crawman75 2 months ago

      @jonocatliff Really? Even when I run the automation I can see the results in Apify, but they're not pulling into Make. When I run the process I can see it running in Apify, but nothing shows up in Make.

    • @jonocatliff
      @jonocatliff 2 months ago

      Did you run the Apify actor synchronously in Make.com? You'll have to check the box.

    • @jonocatliff
      @jonocatliff 2 months ago

      I just tried re-downloading the blueprint in the description and it worked for me. Did you try downloading it for free? jonocatliff.gumroad.com/l/ekunr