HowToDataScience : Scraping Twitter Data

  • Published on Jan 31, 2025

Comments •

  • @Ighodalo_
    @Ighodalo_ 3 years ago

    Hey Ritvikmath, a friend recommended your channel a few weeks back and it has been nothing short of excellent. Thanks for putting in this amount of work to explain data science and machine learning concepts in the simplest and plainest way possible.
    God bless you real good.

  • @tiagomelo2288
    @tiagomelo2288 5 years ago +9

    Great video! Short, concise and straight to the point. However, to anyone having some sort of issue: you should switch the 'wb' when opening the .csv file to just 'w', since 'wb' causes your program to expect a bytes-like object. I don't know how you got your code to work like that.

    • @satyajitsaha3008
      @satyajitsaha3008 4 years ago +1

      Many thanks, Tiago, for providing the solution. I got stuck there. However, this problem shows up in Python 3.5, while older versions should work fine with the given code.
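A minimal sketch of the fix described in this thread, assuming Python 3. The helper name `write_rows`, the file name, and the sample rows are made up for illustration; only the switch from 'wb' to 'w' comes from the comments above.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    # Text mode ('w'), not binary ('wb'): in Python 3, csv.writer writes str.
    # newline='' is what the csv module documentation recommends for csv files.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Hypothetical sample data, not taken from the video
rows = [["timestamp", "tweet_text"], ["2019-05-12", "hello, world"]]
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "hashtag_demo.csv")
write_rows(path, rows)

with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))

# For comparison: the same write in binary mode fails in Python 3.
binary_mode_error = None
try:
    with open(os.path.join(tmp_dir, "binary_demo.csv"), "wb") as f:
        csv.writer(f).writerow(["x"])
except TypeError as exc:
    binary_mode_error = str(exc)

print(read_back)
print(binary_mode_error)
```

The second part reproduces the "bytes-like object is required" TypeError that several commenters report.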

  • @blownspeakersss
    @blownspeakersss 6 years ago +12

    Note for Python 3 users: don't use the .encode() method on the text. Rather, just use the following parameters in the open function: open(filename, 'w', encoding='utf8').
    Was losing my mind trying to figure out why I was getting bytes in my csv file.

    • @pewpew7108
      @pewpew7108 6 years ago

      Thanks kid

    • @ritvikmath
      @ritvikmath  5 years ago

      Thank you :)

    • @ismailaitzaid4300
      @ismailaitzaid4300 5 years ago

      @blownspeakersss It sounds like I'm the one who wrote that comment from the future, dude. I had also been trying for 2 days before coming up with this solution. :)

    • @maheshable100
      @maheshable100 5 years ago +1

      Where should I write this code?

    • @rifqifauzan8468
      @rifqifauzan8468 2 years ago

      @ritvikmath NameError: name 'raw_input' is not defined
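A small sketch of why calling .encode() puts bytes in the CSV under Python 3, as the top comment in this thread describes. The sample text is made up; the csv module turns non-string values into text with str(), so a bytes object is written as its b'...' representation.

```python
import csv
import io

text = "caf\u00e9"  # hypothetical tweet text containing a non-ASCII character

# With .encode(): the bytes object is stringified, so the cell starts with b'
buf = io.StringIO()
csv.writer(buf).writerow([text.encode("utf-8")])
with_encode = buf.getvalue()

# Without .encode(): the text is written as-is
buf = io.StringIO()
csv.writer(buf).writerow([text])
without_encode = buf.getvalue()

print(repr(with_encode))
print(repr(without_encode))
```

When writing to a real file, the encoding is chosen once in open(filename, 'w', encoding='utf8'), as the comment suggests.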

  • @fredianriko5648
    @fredianriko5648 5 years ago +5

    thank you sir for providing this video... it helps me so much on my final project at university

  • @syedalihassan4156
    @syedalihassan4156 3 years ago +3

    Great. Can you or anyone tell me how to fetch using hashtags for a specific time period? For example, I am planning to fetch from 12th May 2019 to 12th July 2019.

  • @shirososano4333
    @shirososano4333 4 years ago +1

    Thank you, I made some modifications and it worked!!

  • @laurarenwick943
    @laurarenwick943 4 years ago +1

    This was super helpful. Thanks for this!

  • @dwititipbeliin8008
    @dwititipbeliin8008 5 years ago +4

    Thanks for the great video. How would you change the dates so we can scrape a "keyword" in 2019, for example?

  • @luj2893
    @luj2893 5 years ago +7

    Can anyone help me? In my Jupyter notebook, I got a type error:
    TypeError: a bytes-like object is required, not 'str'
    why is that?

    • @pratikpatil0830
      @pratikpatil0830 5 years ago +1

      same problem

    • @shivanshsingh2869
      @shivanshsingh2869 5 years ago +6

      Replace this line of ritvikmath's: with open('%s.csv'%(spreadsheetName),'wb') as file:
      with my code:
      with open('hastag%s.csv'%(spreadsheetName),'w') as file:

    • @ritvikmath
      @ritvikmath  5 years ago +1

      thanks for helping :)

    • @kunalbambardekar4739
      @kunalbambardekar4739 4 years ago

      @shivanshsingh2869 Hey, it gives the same error.

  • @awaisawan1276
    @awaisawan1276 5 years ago +4

    Sir, can you make a complete, detailed video on how to extract tweets with images from Twitter? Thanks a lot.

  • @KijuMr
    @KijuMr 3 years ago +1

    Hi, for some reason it creates only 200 rows of data even when I ask for 1000, for example. What's the reason behind it?

  • @myjonpol
    @myjonpol 4 years ago +4

    Hi, thank you for the tutorial.
    What if I want to limit the extracted tweets to a specific location (i.e. a country)?

    • @palakagarwal3764
      @palakagarwal3764 4 years ago +2

      Did you find out how to do this? Can you please share?

    • @myjonpol
      @myjonpol 4 years ago

      @palakagarwal3764 You need to use Stream.filter with bounding box coordinates as parameters.

    • @mariag6111
      @mariag6111 4 years ago

      @myjonpol Hi. I'm trying to find tweets from a specific town, but it's not really working for me. It would be awesome if you could explain how you did that ^_^
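A rough sketch of the bounding-box idea mentioned in this thread, done client-side in plain Python rather than through Stream.filter. The function name `in_bounding_box` and the box coordinates are illustrative assumptions, not an official boundary; the (south-west longitude, south-west latitude, north-east longitude, north-east latitude) ordering is assumed to match what Twitter's streaming `locations` filter expects.

```python
def in_bounding_box(lon, lat, box):
    """Return True if the point (lon, lat) lies inside the box.

    box is (sw_lon, sw_lat, ne_lon, ne_lat): south-west corner first,
    longitude before latitude (an assumption about the expected ordering).
    """
    sw_lon, sw_lat, ne_lon, ne_lat = box
    return sw_lon <= lon <= ne_lon and sw_lat <= lat <= ne_lat


# Hypothetical, approximate box around London (for illustration only)
LONDON_BOX = (-0.51, 51.28, 0.33, 51.69)

print(in_bounding_box(-0.1276, 51.5072, LONDON_BOX))  # a point in central London
print(in_bounding_box(2.3522, 48.8566, LONDON_BOX))   # a point in Paris
```

A check like this only helps for tweets that carry coordinates, and it does not handle boxes that cross the 180° meridian.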

  • @julianzhai6321
    @julianzhai6321 4 years ago +1

    Hi, can you make another video showing how to export tweets by user instead of by hashtag? Thanks

  • @abinandhans5386
    @abinandhans5386 2 years ago

    Super, bro.
    Thanks a lot, bro.

  • @kritikhetan
    @kritikhetan 4 years ago +1

    While trying to create the account, I am getting the error "You'll need an app and API key in order to authenticate and integrate with most Twitter developer products. Create an app to get your API key."

  • @fredianriko5648
    @fredianriko5648 5 years ago +2

    In my Jupyter notebook, I get an error:
    NameError: name 'raw_input' is not defined
    Why is that?

    • @MultiCovalski
      @MultiCovalski 5 years ago +3

      replace raw_input() with input()

    • @ritvikmath
      @ritvikmath  5 years ago +1

      thanks for helping!
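A short sketch of the raw_input fix from this thread, written so the same lines run on Python 2 and Python 3. The name `read_line` is made up for illustration.

```python
# raw_input() exists only in Python 2; Python 3 renamed it to input().
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input      # Python 3

# Usage (commented out so the snippet does not block waiting for input):
# consumer_key = read_line("Consumer Key ")
```

If the code only ever needs to run on Python 3, replacing raw_input() with input() directly, as suggested above, is simpler.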

  • @dhanyamj5292
    @dhanyamj5292 4 years ago +1

    @ritvikmath I am getting "bytes-like object is required, not 'str' " error

    • @minoo1984
      @minoo1984 4 years ago

      Same, please help!

    • @skarthik5459
      @skarthik5459 3 years ago +1

      Hi! Change this line of code " with open('%s.csv' % (fname), 'wb') as file: " to " with open('%s.csv' % (fname), 'w') as file: ". By doing this you will be able to write string data rather than binary data.

  • @aniketkambli
    @aniketkambli 5 years ago

    great work man keep it up

  • @thomaskersig5291
    @thomaskersig5291 5 years ago +1

    Thanks for sharing this! When I set the number of items to 100,000, I only end up with e.g. 82 tweets for a particular, yet popular, hashtag. For others I get more or even fewer. How come? How can I work around this to get the desired number of tweets?

    • @ritvikmath
      @ritvikmath  5 years ago +1

      I feel that lately popular social media sites have been throttling the amount of data you can obtain for free :\

  • @Leo-zj2qp
    @Leo-zj2qp 4 years ago

    Beautiful, thanks!

  • @inss8808
    @inss8808 5 years ago +2

    Any help with this?
    ModuleNotFoundError Traceback (most recent call last)
    in
    1 import json
    2 import csv
    ----> 3 import tweepy
    4 import re
    5 """
    ModuleNotFoundError: No module named 'tweepy'

    • @HarshSingh-ug7bw
      @HarshSingh-ug7bw 5 years ago

      try installing tweepy... by executing 'pip install tweepy' in your command line.

    • @ritvikmath
      @ritvikmath  5 years ago

      Thanks for helping!
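A small sketch for checking whether a package is importable before running the script, which can help when pip installed tweepy into a different Python than the one the notebook uses. The helper name `is_installed` is an assumption for illustration.

```python
import importlib.util


def is_installed(module_name):
    # find_spec returns None when a top-level module cannot be found
    return importlib.util.find_spec(module_name) is not None


if not is_installed("tweepy"):
    # Running pip through "python -m pip" targets the interpreter you launch it with
    print("tweepy is missing; try: python -m pip install tweepy")
```

In a Jupyter notebook, running the install from a cell in the same kernel is another way to make sure the package lands in the right environment.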

  • @ritvikdayal5970
    @ritvikdayal5970 5 years ago

    just dropping to introduce Ritvik to Ritvik
    HELLOO!!!!

  • @joaquinmorenoantuna9573
    @joaquinmorenoantuna9573 5 years ago

    Hi ritvikmath! Thanks for the video and code. Is there a way to segment country and age of the users?

  • @asyakur_
    @asyakur_ 4 years ago +1

    Please help. I need to get data within a certain date range. I have tried using SINCE and UNTIL but it doesn't work.

    • @FriendlySpiderFriend
      @FriendlySpiderFriend 4 years ago

      Abdus Syakur try using an "if" statement. so.. if the date is within the given range, add it to the list
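A minimal sketch of the "if" statement idea from the reply above: keep a tweet only when its creation date falls inside the wanted range. The function name `within_range` and the sample tweets are made up; in the real script the datetime would come from each tweet's creation timestamp.

```python
from datetime import date, datetime


def within_range(created_at, start, end):
    # Compare on calendar dates so the whole end day is included
    return start <= created_at.date() <= end


# Hypothetical (created_at, text) pairs standing in for fetched tweets
tweets = [
    (datetime(2019, 5, 11, 23, 59), "too early"),
    (datetime(2019, 6, 1, 8, 30), "in range"),
    (datetime(2019, 7, 12, 22, 0), "last day"),
    (datetime(2019, 7, 13, 0, 1), "too late"),
]

start, end = date(2019, 5, 12), date(2019, 7, 12)
kept = [text for created_at, text in tweets if within_range(created_at, start, end)]
print(kept)
```

This only narrows down what the API already returned. As far as I know, the free standard search reaches back only about a week, so older date ranges may come back empty regardless of the filter.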

  • @bharadwajchivukula2945
    @bharadwajchivukula2945 6 years ago

    thanks a lot for this video great work!!

  • @ayeshakhan2233
    @ayeshakhan2233 4 years ago +1

    'charmap' codec can't encode character '\u30fc' in position 379: character maps to <undefined>
    I'm getting this error. Can someone help?

    • @RodrigueTchamna
      @RodrigueTchamna 4 years ago

      Hello dear Ayesha. I am having the same problem here. Did you solve it? If so, how?
      'charmap' codec can't encode character

    • @ayeshakhan2233
      @ayeshakhan2233 4 years ago +3

      @RodrigueTchamna just add encoding='utf-8' to with open('hastag%s.csv' %(fname), 'w+', encoding='utf-8')
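A short sketch of what the 'charmap' error means and why encoding='utf-8' fixes it, assuming the default file encoding was a Windows code page such as cp1252 (that part is a guess about the commenter's machine). The file name is made up.

```python
import os
import tempfile

ch = "\u30fc"  # the character from the error message above

# A legacy code page like cp1252 has no mapping for this character
try:
    ch.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can encode any Unicode character
utf8_bytes = ch.encode("utf-8")

# Passing encoding='utf-8' to open() makes the file use UTF-8 regardless of OS defaults
path = os.path.join(tempfile.mkdtemp(), "hashtag_demo.csv")
with open(path, "w", encoding="utf-8") as f:
    f.write(ch)
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print(cp1252_ok, utf8_bytes, round_trip == ch)
```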

  • @bthapa94
    @bthapa94 3 years ago

    How do you alter the code to see all hashtags from 2020-Jan-1 to 2020-Dec-31 for example?

  • @davidsonvalera2538
    @davidsonvalera2538 4 years ago +1

    Please help, I got the error "NameError: name 'search_for_hashtags' is not defined"

    • @tommosby744
      @tommosby744 4 years ago

      Make sure
      def search_for_hashtags(consumer_key, consumer_secret, access_token, access_token_secret, hashtag_phrase):
      is written as above (it is case sensitive). That error indicates that Python can't find anything defined with the name 'search_for_hashtags'.
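A bare-bones sketch of the ordering that avoids this NameError: the def has to run before the call that uses it. The function body here is a placeholder for illustration only; the real function from the video makes the API calls.

```python
def search_for_hashtags(consumer_key, consumer_secret, access_token,
                        access_token_secret, hashtag_phrase):
    # Placeholder body (assumption): stands in for the real search logic
    return "would search for %s" % hashtag_phrase


if __name__ == "__main__":
    # By the time this line runs, the name above has already been defined
    print(search_for_hashtags("key", "secret", "token", "token_secret", "#python"))
```

In a Jupyter notebook, the same error can appear when the cell containing the def was never executed, or was executed before a kernel restart.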

  • @elluranands6904
    @elluranands6904 1 year ago

    Any idea what argument I need to use in the function to download tweets from one particular country only, such as India?

  • @burakoglakc1286
    @burakoglakc1286 6 years ago +1

    Thank you so much for this tutorial, but it would be better if location search were added...

  • @uxdesigner2040
    @uxdesigner2040 4 years ago +1

    How do you adapt it to Python 3.8?

  • @Maverick_8457
    @Maverick_8457 4 years ago +1

    Hello,
    I am getting an error that " raw_imput " cannot be found!
    Where did you write the code for that?

  • @vitoriaaraujo2012
    @vitoriaaraujo2012 5 years ago +3

    Didn't work for me :/

  • @jowwwman
    @jowwwman 5 years ago +1

    Thanks for the video! Any way we can scrape using the tweet text instead of the hashtag?

    • @tiagomelo2288
      @tiagomelo2288 5 years ago

      You already are doing that. If you take a look at your .csv file, most of the tweets don't have the hashtag you were looking for

  • @mercymuenimwangi
    @mercymuenimwangi 5 years ago +2

    How can I obtain the data based on a country?

    • @ritvikmath
      @ritvikmath  5 years ago

      Thanks for the question, maybe this documentation can help you get started: developer.twitter.com/en/docs/tutorials/filtering-tweets-by-location

  • @Malak-jt5oj
    @Malak-jt5oj 2 years ago

    Everything is working, but the CSV file is empty. I tried many words but the files are always empty.

  • @abhisheksharma6617
    @abhisheksharma6617 6 years ago +2

    Could this be done as easily in R?

    • @ritvikmath
      @ritvikmath  5 years ago

      I'm sure there would be a way to get this same result with R

  • @rohitbhattacharya8336
    @rohitbhattacharya8336 4 years ago +1

    I want to extract 100000 tweets for a particular month. How do I do that?

  • @cocoarecords
    @cocoarecords 4 years ago

    awesome stuff

  • @jonathanfabish8032
    @jonathanfabish8032 6 years ago

    Awesome. Thanks!

  • @johncorcoran8666
    @johncorcoran8666 5 years ago

    Thanks for the great video. How would you change the code to search for tweets that contain Aquaman & DC Comics in the text of the tweet?

    • @ritvikmath
      @ritvikmath  5 years ago +1

      Great question. Here is a reference from Twitter for all the cool syntax tricks you can use to get exactly what you want: developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators.html
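A small sketch of building a search phrase for the question above, assuming the standard search operators behave as that reference describes: terms separated by spaces must all match, and double quotes mark an exact phrase. The helper name `build_query` is made up for illustration.

```python
def build_query(terms):
    # Quote multi-word terms so they are matched as exact phrases;
    # joining with spaces asks for tweets that contain every term.
    parts = []
    for term in terms:
        parts.append('"%s"' % term if " " in term else term)
    return " ".join(parts)


query = build_query(["Aquaman", "DC Comics"])
print(query)
```

The resulting string would be typed in wherever the script asks for the hashtag phrase. Under the same assumption, two hashtags would be combined with a plain space rather than the word AND.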

  • @hirojandro335
    @hirojandro335 4 years ago +1

    I need your help: how can I get tweets for a specific date and time? I tried your code and it works for me. I need your help badly.

    • @kaushikkash9108
      @kaushikkash9108 4 years ago +1

      Hey, did you figure it out? I am stuck on it too!!

    • @hirojandro335
      @hirojandro335 4 years ago

      @kaushikkash9108 Not yet.

  • @kaushalkumar3017
    @kaushalkumar3017 4 years ago

    Hi Ritvik, how can we get tweets from 2010 to 2020 filtered by keyword?

  • @kangjayden15
    @kangjayden15 2 years ago

    Sorry for the late comment, but can this be done in PyCharm?

  • @utkarshrajeshsingh7199
    @utkarshrajeshsingh7199 5 years ago

    I am facing an issue with the code; it is giving this error:
    TypeError: a bytes-like object is required, not 'str'

    • @dwititipbeliin8008
      @dwititipbeliin8008 5 years ago +3

      you should switch the 'wb' when opening the .csv file to just 'w'

    • @rahultakle12
      @rahultakle12 4 years ago

      @dwititipbeliin8008 You are great. I am learning Python and you explain things very simply. Your code is working perfectly.

  • @symnshah
    @symnshah 4 years ago

    Nice video. I am going to subscribe to your channel.

  • @Malimy
    @Malimy 5 years ago

    Thank you so much!

  • @user-bo3mt2dk6o
    @user-bo3mt2dk6o 5 years ago

    I managed to get it to run and download the data with the helpful suggestions from the comments, thanks! BUT: in my output I don't get separate columns for date, text or hashtags. Every tweet is stored in one row without columns unlike in the output you show in the video. I don't know why. My first row reads timestamp,tweet_text,username,all_hashtags,followers_count, so they are not being separated into columns. Does someone know how to fix this? Thanks in advance!

    • @dwititipbeliin8008
      @dwititipbeliin8008 5 years ago

      When you write a row, make sure you use ' ' quotes and a comma for every field you need.
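A quick sketch for checking how many columns a row really has, independent of any spreadsheet program. It uses the header quoted in the question; the variable names are made up. Passing a list to writerow gives one field per item, while passing a single pre-joined string gives one field.

```python
import csv
import io

header = ["timestamp", "tweet_text", "username", "all_hashtags", "followers_count"]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(header)              # a list: five separate fields
writer.writerow([",".join(header)])  # one pre-joined string: a single quoted field

buf.seek(0)
parsed = list(csv.reader(buf))
fields_ok = len(parsed[0])
fields_joined = len(parsed[1])
print(fields_ok, fields_joined)
```

If the file parses into five fields like this but a spreadsheet still shows everything in one column, the cause may be the spreadsheet's delimiter setting rather than the Python code; that is a guess, since the header quoted in the question already looks correctly comma-separated.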

  • @farestabib6858
    @farestabib6858 2 years ago

    The "Create new Twitter app" link is not working. Any solutions?

  • @shabu1649
    @shabu1649 4 years ago

    I get a UnicodeEncodeError. I see lang = "en", so shouldn't it fetch only English text? I got Greek text and it cannot be encoded 0....0

  • @thank_q
    @thank_q 6 years ago +2

    I tried searching for hashtags that are in Korean, and I got 'utf-8' codec can't decode byte 0x98 in position 0: invalid start byte'. How can I fix this?

  • @dhanashreedeshpande7100
    @dhanashreedeshpande7100 4 years ago

    If I want to fetch the profile data of a user whose Twitter page I am visiting, how can I fetch it without logging in? Please tell me.

  • @aamirsohail8385
    @aamirsohail8385 5 years ago

    Where do I have to put the consumer key etc.? I can't understand.

  • @paocommanteiga3468
    @paocommanteiga3468 5 years ago

    Is the hashtag phrase a string?

  • @nikhithasagarreddy
    @nikhithasagarreddy 4 years ago

    Sir, how do I separate tweets based on states?

  • @selimcanpolat8664
    @selimcanpolat8664 4 years ago

    Do I need a developer account to create an application?

  • @rabiemanzoor6880
    @rabiemanzoor6880 5 years ago

    amazing work done(Y)

  • @sumitkaushik3113
    @sumitkaushik3113 4 years ago

    I am getting a TweepError with status code = 400. What does it mean?

  • @rajapeddireddy
    @rajapeddireddy 5 years ago

    May I know how to analyze Instagram data using the big data ecosystem?

  • @TheKinesiologist1
    @TheKinesiologist1 5 years ago

    NameError: name 'raw_input' is not defined. Any hints?

    • @ramsonyj
      @ramsonyj 5 years ago

      Me neither!

    • @valynejuma3649
      @valynejuma3649 4 years ago

      @greendayacdcify Hi, I'm getting NameError: name 'search_for_hashtags' is not defined. How do I fix it?

    • @valynejuma3649
      @valynejuma3649 4 years ago

      @greendayacdcify I used the exact same code as in this program; everything is the same.

  • @290uae
    @290uae 5 years ago

    Help please, I got this error:
    NameErrorTraceback (most recent call last)
    in ()
    7
    8 if __name__ == '__main__':
    ----> 9 search_for_hashtags(consumer_key, consumer_secret, access_token, access_token_secret, hashtag_phrase)
    NameError: name 'search_for_hashtags' is not defined

    • @rahuldave625
      @rahuldave625 5 years ago

      Did you find a solution to this?

  • @aamirsohail8385
    @aamirsohail8385 5 years ago

    When I put C:\users\Aamir\Anaconda3 it says syntax error. How do I remove this error?

    • @oyedeepak
      @oyedeepak 5 years ago

      Try '/' instead of '\'
      Maybe it will work
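A short sketch of two ways to write that Windows path, assuming it was typed inside a Python string. In an ordinary string literal, "\u" starts a Unicode escape, so "C:\users..." fails with a SyntaxError; a raw string or forward slashes avoid that. The variable names are made up.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept literally, so \u is not treated as an escape
raw = PureWindowsPath(r"C:\users\Aamir\Anaconda3")

# Forward slashes, as suggested in the reply above
fwd = PureWindowsPath("C:/users/Aamir/Anaconda3")

same_path = raw == fwd
print(same_path, raw.name)
```

If the path was typed on its own in a notebook cell or at the Python prompt rather than inside quotes, it is not valid Python at all, and that would also produce a syntax error.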

  • @oyputra
    @oyputra 3 years ago

    NameError Traceback (most recent call last)
    in
    ----> 1 consumer_key = raw_input("Consumer Key ")
    2 consumer_secret = raw_input('Consumer Secret ')
    3 access_token = raw_input('Access Token ')
    4 access_token_secret = raw_input('Access Token Secret ')
    5
    NameError: name 'raw_input' is not defined

    • @oyputra
      @oyputra 3 years ago +3

      replace raw_input() with input()

    • @ricksanchez3c273
      @ricksanchez3c273 3 years ago +1

      Thank you!

  • @nesrinebouazizi8442
    @nesrinebouazizi8442 4 years ago

    It does not work if I put two hashtags like #cocacola AND #france. I don't get why. Help, please!

  • @mochamadsalimubaidillah5977
    @mochamadsalimubaidillah5977 4 years ago

    This helped me very, very much. Thank you so much.

  • @nos5903
    @nos5903 5 years ago

    I am trying to collect over 2 lakh tweets with the tag #sarcasm, but the program outputs only up to 2647 tweets. Please help.

    • @kaushikkash9108
      @kaushikkash9108 4 years ago

      This is because there are rate limits on extracting data! Refer to this document to see how much data you can extract based on your API tier: developer.twitter.com/en/docs/rate-limits

    • @nos5903
      @nos5903 4 years ago +1

      @kaushikkash9108 Bro, I got the project done, and then this Corona shit stopped the final exams. Thanks for the reply, by the way.
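A back-of-the-envelope sketch of what a rate limit means for a large collection like the one in this thread. The function name `windows_needed` is made up, and the example limits (100 tweets per request, 180 requests per window) are assumptions for illustration only; the linked rate-limits page has the actual numbers for each endpoint and access level.

```python
import math


def windows_needed(total_tweets, tweets_per_request, requests_per_window):
    # How many requests are needed, then how many rate-limit windows those take
    requests = math.ceil(total_tweets / tweets_per_request)
    return math.ceil(requests / requests_per_window)


# 2 lakh = 200,000 tweets, with the assumed limits described above
windows = windows_needed(200_000, 100, 180)
print(windows)
```

Even with enough windows, the search may simply run out of matching tweets within the period the API covers, which could explain getting far fewer results than requested.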

  • @dorothykabarozi
    @dorothykabarozi 4 years ago

    Totally failed to install tweepy. I am using Python 3.7.2.

    • @afirrk
      @afirrk 4 years ago

      Did you try pip install tweepy, or pip3 install tweepy, from your terminal?

  • @daltonlimrentak6784
    @daltonlimrentak6784 3 years ago

    TweepError: Twitter error response: status code = 401
