EDIT:
Elon Musk ruined the API part, so you can use some other API instead.
Here's how one subscriber built the same project using other APIs - chenmeiqiao.notion.site/How-I-Reach-Out-to-TH-camrs-I-Like-As-A-Data-Engineer-c37bdddefde54c3789229ffa5a789432
Or you can use a static dataset from Kaggle and then use Airflow to process it:
www.kaggle.com/datasets/mmmarchetti/tweets-dataset
FAQ:
1. Twitter removed free access: It still has free access, but with limits on the number of requests you can make - developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api
2. You need to request V2 access: You will get an error, but by reading the docs you should be able to get Elevated access.
Worked very hard for this project🤞
Don’t forget to hit the like button, and if you want to support my work you can join the channel membership. It's only 59rs per month, and it helps me keep this content free.
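For anyone taking the static-dataset route, here is a minimal sketch of the transform step using only Python's standard library. The column names `user`, `text` and `likes` are assumptions for illustration, not the real headers of the Kaggle file, so check the dataset you download and adjust them. A function like this could then be called from an Airflow task.

```python
import csv
import io


def transform_tweets(csv_text, min_likes=0):
    """Parse CSV text and keep a refined subset of columns.

    The column names 'user', 'text' and 'likes' are hypothetical;
    replace them with the real header names of your dataset.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    refined = []
    for row in reader:
        likes = int(row.get("likes") or 0)  # treat a missing value as 0
        if likes >= min_likes:
            refined.append({
                "user": row["user"],
                "text": row["text"].strip(),  # drop stray whitespace
                "likes": likes,
            })
    return refined


# Tiny made-up sample standing in for the downloaded file
sample = "user,text,likes\nalice, hello world ,3\nbob,airflow rocks,10\n"
rows = transform_tweets(sample, min_likes=5)
print(rows)
```

With a real file you would pass `open("tweets.csv", encoding="utf-8").read()` instead of the sample string (the file name here is also just a placeholder).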
Already did!!! Appreciate your efforts, brother!!
Hey! Are you using the Elevated access level? Essential doesn't work with the code you are using.
Amazing Darshil!! I was able to replicate the same in GCP
@@nishantagarwal8016 Awesome!
Create a LinkedIn post and tag me
@@Soulfulreader786 There is a free version also
This is what is called content: actually doing something for the community rather than just doing faltu things. Thank you so much, Darshil ❤️
Thank you Sumanta
_"faltu things"_ 😅
Faulty as taking other people interviews 😂😂
@@DarshilParmar redo the project with the Twitter API!!!!!! Now!!!!!!
In the data engineering community you are the GEM. Thanks from the bottom of my heart, Darshil. Keep growing, keep inspiring.
Mr. Darshil, you have saved us a lot of time with this video. It has more content than the talkative videos of other YouTubers. Thank you so much.
This was a very helpful video for me. I have spent the last year learning web development, and I am expanding my skills to include ETL. This video gave me a great overview of the process.
Hi Darshil, you are doing a great job. I have a request: please create some projects around Databricks and Synapse in Azure. We are all waiting desperately. Like AWS, Azure is also gaining popularity, and companies are asking more for Azure Data Engineers. I know you are very busy creating other playlists, but it's a humble request to create just 2-3 data pipeline projects in Azure. Those who want #Azure #Data Engineering, please hit like so this comes to the top and Darshil gets to know.
Super excited for that series🌟⭐
Yes I see azure in demand.
@@prabhatgupta6415 Yes Prabhat, at the early stage of a career, from 1-5 years of experience, Azure is popular.
Twitter stopped free access to their APIs :|
One of the best tutorials I've ever seen on YouTube, a real-world example that was really interesting
Thank you, Darshil, for making this project. Please make projects like this on a regular basis so that they help us enhance our skills.
Thank you Aarav, I’ll try my best
For the first time, I find it worth joining a YouTube channel membership. I hope that with smart and hard work from my side and your help, I will be able to land a Data Engineering job soon.
Bro I am unable to get access to Twitter API v2. Could you please help me out with that part?
Those who are stuck on the API part can look for some other APIs, perform ETL on them, and make their own project.
Darshil helped us understand the concepts, so it's not mandatory to use only the Twitter API.
Thanks a lot Darshil :)
What other APIs are a good place to start other than the Twitter/X API used in the video?
@@snehalsylasmalladi9320 you can search for rapid api or search for open source apis
Now with the arrival of ChatGPT, even though one can generate CODE without analyzing all these, it is always better to understand the logic, nuance and intricacies of CODING. This vid helps a lot in deconstructing this step-by-step.👌
At first, I tried to run Airflow on an AWS Ubuntu t2.micro and it gave an error (airflow standalone did not execute).
But later I tried a t3.medium and it worked well. Thanks for the video!
Awesome, congratulations
@@DarshilParmar Can you please also create a detailed project video that includes Kinesis streams, EMR (PySpark) and Lake Formation or DynamoDB, please!
I've been breaking my head over it. t2.micro doesn't support it, right?
How much did it cost you when you used t3.medium?
I guess it is because Airflow requires a minimum of 4GB of memory to run, so t2.micro is not sufficient.
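A quick way to check this before installing Airflow is to read the instance's total RAM. The sketch below parses `/proc/meminfo`-style text, so it only applies to Linux. The 3.5 GiB default threshold is my own assumption: a nominal 4 GB instance reports slightly less than 4 GiB, and commenters in this thread quote both 2GB and 4GB as the minimum.

```python
def mem_total_gb(meminfo_text):
    """Return total RAM in GiB parsed from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


def enough_memory(meminfo_text, required_gb=3.5):
    """True if total RAM meets the (assumed) threshold in GiB."""
    return mem_total_gb(meminfo_text) >= required_gb


# Made-up text resembling what a 1 GB instance reports
sample = "MemTotal:        1000000 kB\nMemFree:          200000 kB\n"
print(enough_memory(sample))
```

On the EC2 machine itself you would call `enough_memory(open("/proc/meminfo").read())`.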
Wowww., Wonderful explanation.., never before ever after...
True master for Big Data ,🙂Darshil
Thank you so much 😀
I'm older than you, but you are an inspiration, bro. I'm new to data engineering.
Thanks for creating this project. With its help I scheduled my data extraction task; currently I am using the Solcast API to fetch weather data.
Darshillllllll this is Gold. I literally love you rn 😭
This is really amazing, Darshil. I would also like to see architecture-level videos and how all the tools are integrated into the cluster.
Great videos, Darshil! Also a side note, I often watch YouTube videos at 1.5. Yours feel faster even at a normal pace. :)
My natural talking pace is faster, many people complained about it but I can’t help it
Wish i found you earlier. i am learning a lot from you 🙂, recommended your channel to all my colleagues that are in data field
Awesome, thank you!
This is awesome!
can't wait for the next data engineering projects, darshil🔥
greetings from indonesia
Please keep making more videos like this!
I will try my best
Did something change with the API? I get a 403 forbidden error when I try to run the tweepy.user_timeline function.
Thanks - very comprehensive tutorial
With user_timeline deprecated, it would make sense to update the video with the V2 API
Hi Darshil, Thank you for this .It was a great learning experience and it was fun too ! 😛.I am eagerly waiting for more such videos on airflow
Darshil, amazing! I do not have words to say thank yoU!
Real content is finally here ❤ Loved the tutorial.
Bro I am unable to get access to Twitter API v2. Could you please help me out with that part?
Keep up the good work! Your project is the best. Greetings from Chile 😁
The airflow standalone command gets stuck at (parallelism 1). I don't know why, but maybe something doesn't work on t2.micro.
Thanks for teaching basics of air flow and Dags..
Great explanation! Will try to replicate in GCP
Awesome, brother... Loved the way you teach... Hoping for detailed projects... Thanks a ton, brother...
The current free Twitter keys don't return any details; we have to buy a subscription, which is 8000 per month
THANK YOU SO MUCH FOR SUCH QUALITY CONTENT ......GOD BLESS YOU
Hi Darshil, good stuff. You should make a video on dbt. I've been trying forever to connect it to BigQuery and learn how to use things.
Eagerly waiting for this project. Thank you darshil for such amazing projects...
You are welcome
Hi Can you show one on how to do it for SQL instead of Amazon
Important thing to note:
Apache Airflow requires min. 4GB of memory hence free trial EC2 instance won't work.
This is important! I was getting the error "INFO - Triggerer's async thread was blocked for 0.23 seconds, likely by a badly-written trigger. Set PYTHONASYNCIODEBUG=1 to get more information on overrunning coroutines." Upon googling, I saw many messages about resource constraints.
Why can’t we write the CSV file directly to S3 and not use Airflow at all?
I cannot see any use case for Airflow.
Amazing... got to know something interesting. Thanks for the detailed explanation.
I had an issue: Forbidden: 403 Forbidden
453 - You currently have access to a subset of Twitter API v2 endpoints and limited v1.1 endpoints (e.g. media post, oauth) only. If you need access to this endpoint, you may need a different access level. You can learn more here:
It happened when I tried to print(tweets) at 14:27 of this video.
I used a free account, so I just had 1 environment. Do you know if this is the reason for the error? Thanks
The same happened to me yesterday. I guess the Twitter team changed the API plan privileges so that we can't pull tweet data on the basic/free plan.
Even I am getting the same error. Was it resolved for you?
Will this work with "X" ?
Awesome demonstration, truly appreciated. Waiting to see a lot more soon
Appreciate the content. Cleared the basics by watching it.
What about execution?
@@DarshilParmar Not working
Darshil, thank you, brother, for the video. Very informative. I have a question I'm very curious about. This tutorial is for beginners. What components would take, for example, the same project you created to an intermediate level and then an advanced level? I would love to hear your POV.
Thank you so much for making such a helpful video. You explained it very well. This is very helpful for beginners like me.
Will you please guide us on how to install Airflow & Docker on a Windows system?
Can't access twitter developer account due to its service operator not working :(
When I'm running airflow standalone, it seems to be running but I can't see the username and password, and of course I can't connect to the Public IPv4 DNS. Please help me, I can't find a solution anywhere.
Mentor of many data engineers ❤
This project needs an update because Twitter no longer provides API "Elevated" access to read tweets, likes, comments, etc. It should be updated to the current API norms.
The API wouldn't let me request data with this code or many other snippets. It gives me error 453, telling me I need to upgrade to v2.0, but I already have a project created and am authorized to request user timelines. I don't know what is happening.
Note that Airflow needs at least 2GB of RAM, hence it cannot be installed on the free tier.
Great darshil this is really helpful video
Thank you.. keep uploading content like this please
Thank you Tejas for always supporting
Recent changes to twitter API i think .... Apply for elevated Access if you run into issues. By the way Great content Darshil
True twitter developer is charging money to access their api.
@@alwayssporty8102 The CSV file with tweets is being generated for me! The airflow standalone command is not working as shown in the project! Any idea?
Seriously thanks dudeee❤
Please make a tutorial series on Airflow
I will try to work on this
Hey bro always liked your video !! do create more project based Data Engineering Videos. Would be nice if you would create something that will teach about docker.
Sure I will
For those facing a bad file permission error while connecting to the instance: change the file permission for the key using chmod 400 filename in the terminal.
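If you prefer to do the same thing from a script, here is a small Python sketch of what `chmod 400` does: it leaves the file readable by its owner only. It assumes a POSIX system (Linux or macOS), and a throwaway temp file stands in for the real `.pem` key.

```python
import os
import stat
import tempfile


def restrict_key_permissions(path):
    """Make the file readable by its owner only (same effect as chmod 400)."""
    os.chmod(path, stat.S_IRUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate on a throwaway file standing in for the .pem key
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

mode = restrict_key_permissions(demo_path)
print(oct(mode))  # permission bits after the change

os.remove(demo_path)  # clean up the demo file
```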
thanks for sharing info - A step by step guide
Now this is a real-world step, rather than "intro to numpy and pandas"
Great work, you are helping a lot of people.
Thanks Darshil for the project ! Loved it ! Can you suggest topics/ideas for more projects so that we can do it on our own ?
I already have a video on this, please check my channel
Hi Darshil I have this error following your code:
453 - You currently have Essential access which includes access to Twitter API v2 endpoints only. If you need access to this endpoint, you’ll need to apply for Elevated access via the Developer Portal
Thanks in advance
Hi! In your Twitter Developer Portal, click Products in the left side menu, go to the Elevated tab and make your request for Elevated access
@@fabiomarquez7596 Thanks Fabio.
Hi Ariel, I'm also having this problem. Basically, you need to sign up to get access to Twitter API v1;
you only have access to Twitter API V2, and the data from that API is different
Even I am getting the same error
Can you please tell me how did you resolve it?
I feel like at 12:29 the access_key is your API key, and the consumer key is your access key, isn't it?
Thanks and keep doing more... Please also make python part 2 ..
Instead of forcing the airflow webserver to stop, you can just try resetting DB in a different terminal and it will work too.
Thanks for the feedback Siddhesh
Not able to connect to the airflow UI :/
Just get black screen in the browser and it seems to try to load forever.
I had to install airflow in a virtual environment. The EC2 machine gave me an error if I tried to install it in the root.
Seems like you cannot run airflow on the free tier t2.micro with only 1 GB of RAM?
@@KasperBirkelund Same for t3.micro :c
Great explanation. Thanks, Darshil! 😊
Glad you liked it!
Have you ever used Argo in place of Airflow?
Very clean and understandable 🎉
Error: Apply for elevated access via the developer portal.
Is Twitter now charging the users for elevating their API access level?
same error , has anyone solved this ?
Same here, has anyone solved this?
Yeah, now there is no Elevated access
Very crystal clear 🔮 explanation 🎉
Bro I am unable to get access to Twitter API v2. Could you please help me out with that part?
Would it be possible to put all of these sudo commands into the User Data, so they run when the instance is started?
I created the instance but not able to connect to 8080 port.
Did everything mentioned in the video
clear and concise content!! Great!
do you provide any demo sessions before enrolling in the course?
Thank you so much for the great content. I want to ask: if I need more columns from the JSON data, how can I find out what the keys and their associated values are, and under which key of the JSON they sit? How can I refine it? Thank you
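One way to explore an unfamiliar JSON response is to print the path to every value it contains. Below is a minimal sketch, assuming the response is already loaded as a Python dict. The sample payload is made up for illustration and is not the real Twitter schema.

```python
import json


def key_paths(obj, prefix=""):
    """Yield dotted paths to every leaf value in nested JSON data."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from key_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            yield from key_paths(item, f"{prefix}[{i}]")
    else:
        yield prefix  # reached a plain value (string, number, bool, null)


# Made-up payload for illustration only
sample = json.loads(
    '{"user": {"screen_name": "a"}, "text": "hi", '
    '"entities": {"hashtags": [{"text": "x"}]}}'
)
paths = list(key_paths(sample))
for path in paths:
    print(path)
```

Once you see a path such as `user.screen_name`, you can read it with `sample["user"]["screen_name"]` and add it as a new column in your refined data.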
cool man, just what I needed
I have installed Docker Desktop and everything is running fine, but in the .py file I get an error on "from airflow import DAG"... Can you suggest anything?
So what should we do to make it work, if not allow all traffic? You said it is a bad practice, but not what the good practice is ^^
Generally, you won't have to worry about the security part as a Data Engineer, and I did not want to add extra complications by explaining ports, etc.
Mostly the 8080 HTTP port should work, but you will have to read about the ports and requests used to access Airflow and EC2
Hi @darshil
Will I get charged if I use AWS EC2 & an S3 bucket in the free tier? I have heard of people getting bills even on a free tier AWS account.
Where is it specified that the created CSV should be saved in the bucket? Is it some implicit functionality of AWS that the Python code df.to_csv("filename.csv") results in a file being stored in the bucket? Can someone point me towards an explanation?
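As far as I know there is nothing implicit here: a plain file name is written to the local disk of the machine running the code, and the file only lands in a bucket when the target is an `s3://` path. A minimal sketch of the difference follows. The bucket name is hypothetical and the helper `s3_uri` is my own, not something from the video.

```python
def s3_uri(bucket, key):
    """Build an s3:// URI from a bucket name and an object key."""
    if not bucket or "/" in bucket:
        raise ValueError("bucket must be a non-empty name without slashes")
    return f"s3://{bucket}/{key.lstrip('/')}"


# A plain file name is written to the EC2 instance's own disk
local_target = "refined_tweets.csv"

# An s3:// URI points at an object in a bucket (hypothetical bucket name)
s3_target = s3_uri("my-example-bucket", "refined_tweets.csv")

print(local_target)
print(s3_target)

# With pandas plus the optional s3fs package installed, and credentials or an
# IAM role that allows writes to the bucket, the call would look like:
#   df.to_csv(s3_target)
```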
Hi
There is no 'Elevated access' option as of now. There are only Free, Basic and Enterprise. Basic is paid and Enterprise is for companies. Is there any workaround for this?
Bro literally the same error I am getting, I am unable to get access to Twitter API v2. Could you please help me out with that part?
@@rohitpandey9920 no workaround still :(
Why did you trigger it manually?
Please make an Apache Spark for data engineering series for beginners
I have an in-depth course on this with 2 end-to-end projects here - datavidhya.com/courses/apache
@@DarshilParmar ok will check
Great work!
amazing as usual
Thanks Darshil for the very educational video!
I am running into a problem where the password when I execute 'airflow standalone' is not appearing.
Did anyone face this challenge?
Informative.
Here Airflow is running on a single instance; if the load increases it may fail, right? And do we have any tool like Azure ADF?
@DarshilParmar If data is flowing continuously into S3, will the newly added data be appended to the existing file or table, or will it create a new file?
Nice video!! What if you have more than one DAG? When you change the airflow.cfg, how do you include the other DAGs? Another question: do you teach about Airflow in your paid courses?
I have plans to work on an Airflow course
Make more on airflow and end to end data pipeline. Great effort :)
I’ll try my best to work on more
Can you help me with airflow standalone? It is not working
excellent, thanks for the content
Thank you for everything it helps me so much
Thank you for support Karthik
I started doing this project, but I am unable to start Airflow using the command "airflow standalone"
Please make more videos on Apache Airflow DAGs
Very informative👍
You need to elevate your API Access from essential to elevated