Scraping LinkedIn Profiles with Python Scrapy (2022)

  • Published on 10 Dec 2024

Comments • 72

  • @liatavskyioleksii7387
    @liatavskyioleksii7387 1 year ago +4

    Doesn't work for now. From time to time LinkedIn sends a page that has 141 li classes, and it's not the page we expect.

  • @sumeetkumarsingh9377
    @sumeetkumarsingh9377 1 year ago +1

    I tried this but I am not getting any output. The script runs, but I am not getting the desired output. How do I solve this issue?

  • @jannikendress2378
    @jannikendress2378 1 year ago +2

    I always get a 500 Internal Server Error. Can anyone help me?

    • @kainguyen4259
      @kainguyen4259 1 year ago

      Same here! I have not been able to get a lot of profiles. In fact, I only got like 3-4 profiles out of 200 links. :( It was not too bad yesterday, I got about 8-9 profiles out of 10 profiles in a testing round. No idea what went wrong today.
      Today though? It was able to crawl lots of profiles but wasn't scraping any of those, and then when I just tried now, it couldn't even crawl anything at all.

  • @makedatauseful1015
    @makedatauseful1015 2 years ago +1

    You make useful lessons. Thank you

    • @scrapeops
      @scrapeops 2 years ago

      Thanks for the feedback!

  • @Kattar_HINDU_hu253
    @Kattar_HINDU_hu253 several months ago

    How can I scrape the contact info? If you can help, please share the code snippet with me.

  • @parthshah6520
    @parthshah6520 11 months ago

    When I enter "scrapy list" in my terminal, the response is "quotes" instead of the spider. Can someone help me with this issue?

  • @codewithguillaume
    @codewithguillaume 1 year ago

    Thanks my friend

  • @YoussefBEZZARGA
    @YoussefBEZZARGA 1 year ago +1

    The profile.json file is empty. I followed all the steps and copied the code from the article that goes along with this video. What can be the problem?

    • @mirzaaseembaig268
      @mirzaaseembaig268 1 year ago

      Assalamualaikum brother
      I'm also getting the same problem
      Have you found the solution to that problem? If yes, I kindly request you to help me

  • @voldyrama5489
    @voldyrama5489 1 year ago +2

    I passed a CSV file with a list of 500 usernames as a parameter to put them in the URL, but I only got 19 profiles out of 500. What should I do to get them all?

    • @marwanemouttaqi4407
      @marwanemouttaqi4407 1 year ago +1

      Did you find a solution for that?

    • @mohamed0h0hamed
      @mohamed0h0hamed 1 year ago

      It seems that you have consumed all your FREE API credits. You should upgrade to a larger plan to activate more credits!!

  • @ayush-that
    @ayush-that 3 months ago

    Ran without any errors but returned an empty list in the profile.json. Can anyone tell me what to do?

  • @muhammedabdulsalam614
    @muhammedabdulsalam614 1 year ago +1

    it runs with no errors for me but I don't get any data back

    • @AamirKhan-sg8zk
      @AamirKhan-sg8zk 9 months ago

      same it just returns an empty list

  • @HambaAllah-v1w
    @HambaAllah-v1w 1 year ago +1

    What is the CSS selector for the "open to work" status of LinkedIn user profiles? @scrapeops or anyone, maybe you can help me?
    I tried many times but it still didn't work

  • @shahzaibsaqibwarraich6411
    @shahzaibsaqibwarraich6411 1 year ago +1

    How to go about extracting skills and endorsements?

    • @scrapeops
      @scrapeops 1 year ago

      You just add CSS selectors to extract it from the HTML response like the other data types.

  • @Alexandru.M.P
    @Alexandru.M.P 4 months ago

    Question: if I wanted to scrape all the profiles from LinkedIn for a specific country, say Hungary, how would I go about doing that?

  • @influenceink9528
    @influenceink9528 1 year ago

    So this bypasses the usual limits on most other tools?

  • @agility_matters
    @agility_matters 2 years ago +4

    I've always had one major problem: getting the URL links in the first place. When you want 10-20k profiles, doing it manually doesn't make any sense.
    What do you think about choosing one profile and then having the scraper click through to the next profile in their friends list?

    • @scrapeops
      @scrapeops 2 years ago +5

      LinkedIn has higher protections on any page that lists profiles, etc. so it is harder to scrape reliably.
      Another popular option is to have a scraper query Google for a person's name and LinkedIn profile using advanced queries and get the LinkedIn username that way. You could feed in a list of people/company name combinations and get all the URLs you need.

    • @nashtashasaint-pier7404
      @nashtashasaint-pier7404 2 years ago +2

      @@scrapeops I have been using that method for a little while now and can confirm it is effective at getting people's LinkedIn URLs. Most of the time, I get around 85% of the accounts I wanted. If you really want the remaining 15%, you will have to search manually.
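The search-based approach described in this thread can be sketched roughly as follows. This is a minimal illustration, not the method used in the video: the `site:linkedin.com/in` query shape, the helper names, and the sample person and company are all assumptions, and the step that actually sends the query to a search engine is left out.

```python
from urllib.parse import quote_plus, urlparse


def build_search_query(name: str, company: str) -> str:
    """Build an advanced search query restricted to LinkedIn profile pages."""
    return f'site:linkedin.com/in "{name}" "{company}"'


def username_from_url(url: str):
    """Return the username from a LinkedIn profile URL, or None if it is not one."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host != "linkedin.com" and not host.endswith(".linkedin.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "in":
        return parts[1]
    return None


# Hypothetical person/company pair, used only for illustration.
people = [("Jane Doe", "Example Corp")]
queries = [build_search_query(name, company) for name, company in people]
print(queries[0])
print(quote_plus(queries[0]))  # URL-encoded form, ready for a search request

# Any profile URL found in the search results can be reduced to a username.
print(username_from_url("https://www.linkedin.com/in/reidhofman/"))
```

The usernames recovered this way could then be fed into the spider's profile list.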

  • @AmmarJaved-pb8gx
    @AmmarJaved-pb8gx 1 year ago

    Bro, I copied your GitHub code but unfortunately it doesn't scrape any data, and I can't find any error except the fingerprint one

  • @thibaultnoname3938
    @thibaultnoname3938 11 months ago +1

    How do you replace "reidhofman" in profile_list with a list of usernames stored in a CSV? Let's say I have 500 usernames I want to scrape, all stored in an Excel / CSV file. Is it possible, and how do you do it? Thank you for the video, very interesting

    • @dailywisdomquotes518
      @dailywisdomquotes518 9 months ago

      yes it is, i did it

    • @khouloudsafi1445
      @khouloudsafi1445 9 months ago

      Can you help me with this please?
      Even though I am using the ScrapeOps proxy, I can't scrape the information I need, as it is asking for a login. I tried to log in but it didn't work....
      @@dailywisdomquotes518

    • @SriramkiranDevarakonda-pr5bp
      @SriramkiranDevarakonda-pr5bp 7 months ago

      @@dailywisdomquotes518 Hi, could you please help me do this?
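For the CSV question in this thread, one possible approach is to read the usernames with Python's `csv` module and build `profile_list` from them. This is a sketch under assumptions: the column name `username`, the file name `usernames.csv`, the `load_usernames` helper, and the profile URL pattern are illustrative and not confirmed by the video's code.

```python
import csv
import io


def load_usernames(csv_file, column="username"):
    """Read usernames from a CSV file object, skipping blanks and duplicates."""
    seen = set()
    usernames = []
    for row in csv.DictReader(csv_file):
        name = (row.get(column) or "").strip()
        if name and name not in seen:
            seen.add(name)
            usernames.append(name)
    return usernames


# Inline sample standing in for:
#   open("usernames.csv", newline="", encoding="utf-8")
sample = io.StringIO("username\nreidhofman\n\nexample-user\nreidhofman\n")

profile_list = load_usernames(sample)
# Assumed URL pattern for public profile pages.
profile_urls = [f"https://www.linkedin.com/in/{name}/" for name in profile_list]
print(profile_list)
print(profile_urls)
```

An Excel sheet would first need to be exported to CSV for this to work, and the spider would then loop over `profile_list` instead of a hard-coded name.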

  • @rahilakhan-w3g
    @rahilakhan-w3g 1 year ago +4

    Hi, is it possible to scrape LinkedIn's publicly listed data using selenium/scrapy WITHOUT LOGIN? I am using proxies but it's still getting blocked. Can you please share ideas if you have any?

  • @shashankdwivedi6388
    @shashankdwivedi6388 1 year ago

    it is still showing me the error No module named 'scrapeops_scrapy' after reactivating virtual environment

    • @refinans2087
      @refinans2087 1 year ago

      i am getting the same error too. @scrapeops could you please help?
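A "No module named" error after activating a virtual environment often means the package was installed into a different interpreter than the one running Scrapy. The check below is a small diagnostic sketch; the `module_available` helper is illustrative, and the pip package name `scrapeops-scrapy` is an assumption that should be confirmed against the ScrapeOps documentation.

```python
import importlib.util
import sys


def module_available(module_name: str) -> bool:
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


# Shows which Python is actually running (it should point inside the virtualenv).
print("Interpreter:", sys.executable)

if module_available("scrapeops_scrapy"):
    print("scrapeops_scrapy is importable from this interpreter.")
else:
    # Installing through the same interpreter avoids a pip/python mismatch.
    # The package name below is an assumption.
    print(f"Try: {sys.executable} -m pip install scrapeops-scrapy")
```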

  • @siddharthakar9369
    @siddharthakar9369 1 year ago

    Can we scrape LinkedIn followers from my LinkedIn page?

  • @mrmoux
    @mrmoux 2 years ago

    Getting an error when trying to pip install scrapy (Python 3.11.1)
    ERROR: Failed building wheel for twisted-iocpsupport
    Same thing when trying to manually install Twisted. I was able to install it in a conda environment running python 3.9.15
    Anyone else encountered this issue? Haven't found any relevant info online for win os

  • @carlosjeanpierresaqui
    @carlosjeanpierresaqui 1 year ago

    The .json file it generates comes out empty, could you help me :c

    • @scrapeops
      @scrapeops 1 year ago

      Did you git clone the code in the GitHub repo and run that? Or did you write the code yourself?

    • @carlosjeanpierresaqui
      @carlosjeanpierresaqui 1 year ago

      @@scrapeops I wrote it myself using Jupyter Notebook

    • @scrapeops
      @scrapeops 1 year ago

      @@carlosjeanpierresaqui Writing it in a Jupyter notebook is a lot different from the way shown in the video, as you will likely need to use CrawlerProcess to run the spider instead of a Scrapy project. You will need to make sure you refactor your code correctly for Jupyter notebook.

  • @luispedromachado6611
    @luispedromachado6611 10 months ago

    Do you know how to scrape LinkedIn profiles with over 15k followers?

  • @khouloudsafi1445
    @khouloudsafi1445 1 year ago

    Thank you very much for the tutorial.
    But it's not working for me ....
    When I run "scrapy crawl linkedin_people_profile", I get the following error: "KeyError: 'Spider not found: linkedin_people_profile'" 😢
    Even though I am following the exact same steps you did
    Could you help me with this, as it's urgent

    • @khouloudsafi1445
      @khouloudsafi1445 9 months ago

      Thank you, I have solved the problem. It was just a matter of changing the working directory with cd
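The fix reported here matches how Scrapy finds a project: it looks for a `scrapy.cfg` file in the current directory and its parents, so `scrapy crawl` only sees the project's spiders when run from inside the project. The sketch below mimics that lookup to check the working directory before crawling; the `find_scrapy_project_root` helper is illustrative and not part of Scrapy.

```python
from pathlib import Path


def find_scrapy_project_root(start=None):
    """Return the nearest directory at or above start that contains scrapy.cfg, else None."""
    current = Path(start or Path.cwd()).resolve()
    for directory in (current, *current.parents):
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None


root = find_scrapy_project_root()
if root is None:
    print("No scrapy.cfg found here or in any parent directory; cd into the project first.")
else:
    print(f"Scrapy project root: {root}")
```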

  • @JaneSheppard-z3s
    @JaneSheppard-z3s 2 years ago +3

    Thank you for your tutorial videos.
    When I repeat your code, I have no problems.
    I'm sorry, but I want to ask about one problem that does not concern your code specifically.
    I was looking for answers on Stack Overflow, but without success so far.
    If a Scrapy spider closes after the first request to start_urls every time, in various situations,
    how can I work out what exactly is not working correctly?

    • @scrapeops
      @scrapeops 2 years ago +1

      If you have no errors in your console when you run your scraper, then it is likely you have set up your code not to make any subsequent requests after it finishes with the start requests.

  • @azhari7968
    @azhari7968 1 year ago

    Do you have to log in to LinkedIn first?

    • @scrapeops
      @scrapeops 1 year ago

      No, the scraper only scrapes public pages so you don't need to log in.

    • @azhari7968
      @azhari7968 1 year ago

      @@scrapeops But it shows a 999 status code. What does that mean? It usually gives a 200 status code

    • @tranngocduc5653
      @tranngocduc5653 1 year ago

      @@azhari7968 I've got this error too, have you found any way to get around it?

    • @azhari7968
      @azhari7968 1 year ago

      @@tranngocduc5653 I gave up using scrapy. I use selenium and beautifulsoup instead

  • @inhhuy2473
    @inhhuy2473 1 year ago

    I get 999 error

  • @attilasarkany6123
    @attilasarkany6123 1 year ago +2

    As I remember, LinkedIn's user agreement does not allow you to scrape data. Does anyone have experience with how many profiles you can scrape without being banned?

    • @scrapeops
      @scrapeops 1 year ago +4

      LinkedIn doesn't want you to scrape their pages, which is why it makes it so hard.
      If you are just using 1 IP address then you will start getting blocked very quickly (10 pages could be enough to start getting your requests blocked if you are scraping fast)... They are unlikely to ban you completely that quickly; instead it will start flashing the login screen and stop showing you the data.

    • @xev
      @xev 1 year ago

      @@scrapeops So using ScrapeOps Proxy could help resolve this issue?

    • @scrapeops
      @scrapeops 1 year ago +3

      @@xev Yes, we have proxy providers in our network that can scrape LinkedIn so if you use the proxy then you will be able to reliably scrape it.

    • @benjaminmathewson3111
      @benjaminmathewson3111 9 months ago

      If you're using a VPN and being sure not to log in, then you aren't explicitly violating the User Agreement/ Terms of Use. You cross the line when you log in and scrape from behind a user profile. If you are clever about this, you won't ever have to be concerned about getting banned.
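Since the replies in this thread tie blocking to scraping fast from a single IP, one thing worth trying is slowing the spider down in the project's `settings.py`. The setting names below are standard Scrapy settings, but the values are illustrative assumptions and nothing here guarantees that LinkedIn will stop returning the login screen.

```python
# settings.py (excerpt): illustrative values only

# Wait between requests to the same site (seconds).
DOWNLOAD_DELAY = 5

# Vary each wait between 0.5x and 1.5x DOWNLOAD_DELAY.
RANDOMIZE_DOWNLOAD_DELAY = True

# Send only one request at a time to any single domain.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adapt the delay to the server's response times.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60
```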

  • @aspitola
    @aspitola 1 year ago

    The exact code provides the following response: 500 Internal Server Error. This didn't happen until last week.

    • @scrapeops
      @scrapeops 1 year ago +1

      This was an issue with an underlying proxy provider that is being used when scraping Linkedin. It should be fixed now.

    • @aspitola
      @aspitola 1 year ago

      @@scrapeops Thank goodness. I'll try again tomorrow. Thank you 👍👍👍

    • @hansbaumberger4234
      @hansbaumberger4234 1 year ago

      I am getting the same error. Is there a way around it?

    • @aspitola
      @aspitola 1 year ago

      @@hansbaumberger4234 contact them and wait. It took a few days but they eventually fixed this.

  • @hakiriimed7226
    @hakiriimed7226 1 year ago

    The project runs correctly but the JSON file is empty. Thank you

    • @hansbaumberger4234
      @hansbaumberger4234 1 year ago

      Having the same issue, could you solve it?

    • @zakariaboutrakouine5427
      @zakariaboutrakouine5427 1 year ago

      @@hansbaumberger4234 I get the same problem. Could you please contact me if you find the answer?

  • @jacopodigiacomantonio3103
    @jacopodigiacomantonio3103 1 year ago +1

    is this legal?

    • @DevtalTalks
      @DevtalTalks 1 year ago +1

      in a gray area .. but it is what google and chatgpt do :)

  • @fintech1378
    @fintech1378 1 year ago

    can you scrape posts?

    • @scrapeops
      @scrapeops 1 year ago

      We have 2 other tutorials covering scraping company profiles and jobs. If you mean scraping the posts that are in the main feed when you are logged in, we do not do that, as you need to be logged in and that is against their terms of service.