Netflix Data Cleaning and Analysis Project | End to End Data Engineering Project (SQL + Python)

Ankit Bansal

Views: 47,450

Published on 10 Nov 2024

Comments • 98

@ankitbansal6 5 months ago ⁺¹⁸
Please like the video, as it takes a lot of effort to record a video of more than 1 hour. It will motivate me to create more long-form videos.
GitHub and all related links are in the description box. Thanks for watching!!!
@simplytech4u898 5 months ago ⁺¹
Thank you Ankit, this is really amazing. Once I started, I finished it in one go...
@Hope-xb5jv 5 months ago ⁺¹
10:22 I tried many times but could not get the Korean names into the SQL database.
I created a table and ran the insert as well, but it shows only ????
Now I surrender 😒
@kumarsumit6117 5 months ago
Use nvarchar
@govindpillay1309 a month ago
@@Hope-xb5jv Hey, you can fix the issue by changing the database collation to one that supports UTF-8 (Unicode characters). I was also stuck on the same issue, but after creating the database with a UTF-8 collation the issue was fixed.
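A likely mechanism behind the ???? described in this thread, sketched in Python rather than SQL Server: when Unicode text is forced into a single-byte code page that has no Korean characters, each character that cannot be represented is replaced by a question mark. The title and the choice of the cp1252 code page below are illustrative assumptions, not values taken from the video.

```python
# Illustrative only: the title and the cp1252 code page are assumptions.
title = "오징어 게임"

# Forcing the text into a single-byte Western code page loses every
# character that the code page cannot represent.
lossy = title.encode("cp1252", errors="replace").decode("cp1252")
print(lossy)

# A Unicode encoding such as UTF-8 round-trips the same text unchanged.
safe = title.encode("utf-8").decode("utf-8")
print(safe == title)
```

Once the characters have been stored as question marks they cannot be recovered from the table, which is why both the column type (nvarchar) and the path that carries the data need to be Unicode-aware.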
@rahulrachhoya2716 5 months ago ⁺²
Thanks so much @Ankit for this valuable video. I have an interview with Red Hat in the coming 3 days for an associate data analyst role. I have learned a lot from your videos. You are literally the SQL king because you write in a very simple manner so that everyone can understand. You are my mentor; with your videos I am able to solve questions like you. Salute you @Ankit 😎😎😎
@saikanth447 5 months ago
@rahulrachhoya2716 I have seen the career portal and there is no such DA role. Can you help me with the same, as we are in the same boat? Thanks in advance.
@pavitrashailaja850 5 months ago ⁺³
Great effort in putting the whole project together 🤟🏻
@ankitbansal6 5 months ago ⁺¹
Thanks a ton!
@shahrizwan818 2 months ago
Nice work sir. Thanks for uploading; please upload more videos like this.
@manishpal2937 5 months ago
Thanks Ankit, the effort you put into your lectures is admirable. I learned a lot of new things today from this video 💌
@ankitbansal6 5 months ago
My pleasure 😊
@vijayakanthanannamlai 5 months ago ⁺¹
Love it Ankit... what an effort
@msk-pl3hw 5 months ago ⁺¹
It was a really nice project. Got good hands-on practice in SQL.
@ankitbansal6 5 months ago ⁺¹
Great 😊
@livelovelaugh4050 5 months ago ⁺¹
Thank you so much Sir 🙏. Thank you for giving hope to people like me. Keep inspiring ✨
@ankitbansal6 5 months ago ⁺¹
It's my pleasure
@MiteshYadav 4 months ago ⁺¹
Awesome, can we have a series on Python from the basics that would be useful for analysis?
@anudipray4492 2 months ago
Hi Ankit,
It is very good for learners.
@ishmeenkaur8299 5 months ago
Really good work, easy to understand.
@ritu-pf1jy 5 months ago ⁺¹
Great effort, sir
@manuelmelendezebrat165 a month ago
Hi Ankit, at 37:55, when we create the Netflix clean table, you delete the WHERE clause rn = 1 and then create the table. However, wasn't this clause the one that kept the duplicated titles out of the table?
If we execute:
SELECT title, COUNT(*) FROM netflix GROUP BY title HAVING COUNT(*) > 1
we will find the same duplicates.
I think we should create the table with the WHERE clause. Maybe I'm making a mistake. Thanks for your effort.
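The effect described in this comment can be reproduced with a small sketch. It uses Python's built-in sqlite3 module instead of SQL Server, invented sample rows, and a partition on title alone, all of which are assumptions for illustration; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix_raw (show_id TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO netflix_raw VALUES (?, ?)",
    [("s1", "Love"), ("s2", "Love"), ("s3", "Dark")],  # made-up rows
)

# Number the rows within each title; rn = 1 marks the row to keep.
numbered = (
    "SELECT show_id, title, "
    "ROW_NUMBER() OVER (PARTITION BY title ORDER BY show_id) AS rn "
    "FROM netflix_raw"
)
conn.execute(
    f"CREATE TABLE with_filter AS SELECT show_id, title FROM ({numbered}) WHERE rn = 1"
)
conn.execute(
    f"CREATE TABLE without_filter AS SELECT show_id, title FROM ({numbered})"
)

def duplicate_titles(table):
    """Return (title, count) pairs for titles that appear more than once."""
    return conn.execute(
        f"SELECT title, COUNT(*) FROM {table} GROUP BY title HAVING COUNT(*) > 1"
    ).fetchall()

print(duplicate_titles("with_filter"))
print(duplicate_titles("without_filter"))
```

Without the rn = 1 filter the numbering is computed but never used, so the new table is just a copy of the raw one.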
@saikatofficial420 5 months ago ⁺¹
Thanks a lot sir for this valuable project. Can you please make a video on CROSS APPLY? I have watched your SQL course and didn't find it.
@eemayo5889 5 months ago ⁺¹
Thanks a lot. Could you please show how to download data from an API?
Great content btw.
@ankitbansal6 5 months ago ⁺²
Check the first part of this video
th-cam.com/video/uL0-6kfiH3g/w-d-xo.html
@Vivek2495646 13 days ago
Great content thank you
@ankitbansal6 13 days ago
Glad you enjoyed it!
@Random_World_ 5 months ago ⁺¹
Thanks for this project
@ankitbansal6 5 months ago ⁺¹
My pleasure
@austinmkruahsr.615 5 months ago
This is wonderful. Can I use this same method for PostgreSQL? Please help me...
@ankitbansal6 5 months ago
Yes
@prayanshusharma987 2 months ago
Bro, please make a video on an ETL project with MySQL as well.
@guiltycrown6024 2 months ago
Is this project relevant to real-life scenarios?
@manishasaxena9829 5 months ago
At 28:40, you said that we can't see NULLs because of the string split. Just a thought: isn't it because you removed the NULLs at 8:44?
@ankitbansal6 5 months ago
I didn't remove them. I was just checking the max length and, at that time, excluded them in the analysis only, not in the actual data.
@manishasaxena9829 5 months ago
@@ankitbansal6 Oh yes, you're right, my bad.
Your content is really helpful and very easy to follow. Keep uploading such videos. Thank you!
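The string-split point in this thread can be illustrated with a rough Python analogy (it is not the SQL Server query from the video, and the sample rows are made up): splitting a missing value yields no pieces, so when each row is expanded into one row per piece, rows whose column is NULL drop out of the result while staying in the raw data.

```python
# Made-up rows: (show_id, comma-separated directors); None stands in for NULL.
rows = [
    ("s1", "Director A, Director B"),
    ("s2", None),
    ("s3", "Director C"),
]

def split_values(value):
    """Return the trimmed pieces of a comma-separated string; nothing for NULL."""
    if value is None:
        return []
    return [piece.strip() for piece in value.split(",")]

# One output row per (show_id, director) pair, like splitting per row in SQL.
exploded = [
    (show_id, director)
    for show_id, directors in rows
    for director in split_values(directors)
]
print(exploded)
```

The row for s2 is still present in rows; it is only absent from exploded.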
@pavanmadamset 5 months ago
Thank you very much, sir
@ankitbansal6 5 months ago
Most welcome
@bloofin5259 3 months ago
Regarding the duplicates, can we just delete the duplicate rows? The search query for the duplicates is just a filter, right? The database was not updated?
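A minimal sketch of the distinction raised here, using Python's built-in sqlite3 module and invented rows rather than the video's SQL Server table: a SELECT that finds duplicates only reports them, while a DELETE actually changes the stored rows. Keeping the lowest show_id per title is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix_raw (show_id TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO netflix_raw VALUES (?, ?)",
    [("s1", "Love"), ("s2", "Love"), ("s3", "Dark")],  # made-up rows
)

def row_count():
    return conn.execute("SELECT COUNT(*) FROM netflix_raw").fetchone()[0]

# The duplicate check is read-only: it lists titles, it removes nothing.
dupes = conn.execute(
    "SELECT title FROM netflix_raw GROUP BY title HAVING COUNT(*) > 1"
).fetchall()
count_after_select = row_count()

# Deleting in place: keep the lowest show_id per title, drop the rest.
conn.execute(
    "DELETE FROM netflix_raw "
    "WHERE show_id NOT IN (SELECT MIN(show_id) FROM netflix_raw GROUP BY title)"
)
count_after_delete = row_count()
print(dupes, count_after_select, count_after_delete)
```

Deleting from the raw table is irreversible, so writing the deduplicated rows to a separate clean table, as the video does, keeps the original data available for checks.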
@MayankGadiya-uq1el 5 months ago
Please do a detailed video on how to connect from Jupyter to SQL, explaining the engine, conn, SQLAlchemy, etc.
@ankitbansal6 5 months ago
Watch the previous project video
@sakshiawadhiya7267 5 months ago ⁺¹
I am facing issues in Jupyter Notebook, like "path does not exist".
@neeraj_dama 5 months ago ⁺¹
Well done.
@Ankitkumar-ey4ef 2 months ago
I am using MySQL and importing the CSV file directly into the SQL server (without using Python), but only 100 rows are getting imported. Can you tell me the solution?
@mohammadfurquan241 5 months ago
Thanks a lot, sir.
I have a suggestion: at the end of the video or in the description, please show how someone can mention this project in a resume, with the project description in bullet points. I am a fresher, so it will help me a lot.
Thank you so much sir ❤
@LaxmiNarayan-pd8qn 3 months ago
Hello Sir!
I really appreciate your efforts. I want to ask: can I add this project to my resume for a data analyst role?
If anyone sees this, please reply...
@ankitbansal6 3 months ago ⁺²
Yes, you can
@LaxmiNarayan-pd8qn 3 months ago
@@ankitbansal6 Okay Sir😊
@roopesh3837 5 months ago ⁺¹
In the Netflix table, why is the count 8807? It should be 8804 after removing 3 duplicates. Was the WHERE clause removed by mistake?
@ankitbansal6 5 months ago ⁺¹
You are right, I missed the WHERE clause that retains only unique rows. My bad.
@simplytech4u898 5 months ago ⁺¹
Hi Ankit,
The duration column in the netflix_raw table holds values in both minutes and seasons. If we need the average duration for seasons as well, how do we get it? I believe we need to populate the values the way we did for the other tables. Can you guide us on how to do it?
@JayPatel-wv4mz a month ago ⁺¹
-- Creating two new columns (MySQL syntax; substring_index is not available in SQL Server)
alter table netflix_raw
    add column duration_in_minutes int,
    add column total_number_of_seasons int;
-- Setting the values of these two new columns
update netflix_raw
set duration_in_minutes = case when type = 'Movie' then substring_index(duration, ' ', 1) else null end,
    total_number_of_seasons = case when type = 'TV Show' then substring_index(duration, ' ', 1) else null end;
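The same split can be sketched in Python before loading, assuming duration values look like "90 min" or "2 Seasons" (the sample values and the helper name parse_duration are hypothetical, not from the video). Keeping minutes and seasons in separate fields is what makes an average meaningful for each type.

```python
def parse_duration(content_type, duration):
    """Return (minutes, seasons); at most one is filled, neither if unparseable."""
    if not duration:
        return (None, None)
    number = duration.split(" ", 1)[0]  # leading token, e.g. "90" from "90 min"
    if not number.isdigit():
        return (None, None)
    if content_type == "Movie":
        return (int(number), None)
    if content_type == "TV Show":
        return (None, int(number))
    return (None, None)

# Made-up sample rows: (type, duration)
samples = [("Movie", "90 min"), ("TV Show", "2 Seasons"), ("TV Show", "1 Season"), ("Movie", None)]
parsed = [parse_duration(t, d) for t, d in samples]

# Average number of seasons across TV shows only.
season_counts = [seasons for _, seasons in parsed if seasons is not None]
avg_seasons = sum(season_counts) / len(season_counts)
print(parsed, avg_seasons)
```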
@tanyachugh1640 5 months ago ⁺²
Hi @Ankit Bansal, are there any additional settings needed in SQL Server Management Studio for the special characters to be visible? I have followed the steps twice, but it is still showing question marks for me.
@ankitbansal6 5 months ago
Is the data type nvarchar?
@tanyachugh1640 5 months ago
@@ankitbansal6 Yes, I am using nvarchar only
@TarunDhimanOfficial 5 months ago
@@ankitbansal6 Even after using nvarchar, special characters are still showing as ? ? ? ?.
@piyushsharma8294 5 months ago
Check the reply to @VaibhaviSuresh-bw8hq
@shubhamravikar6029 5 months ago ⁺¹
Hi @Ankit Bansal, I have tried many times to create the table using nvarchar, but it still shows the ??? question marks. I have read all the replies in the comments but couldn't find a solution. Please help so that I can proceed with the project.
@ankitbansal6 5 months ago
You can leave it as it is and proceed to the next tasks.
@shubhamravikar6029 5 months ago ⁺¹
@@ankitbansal6 Okay, Thanks
@abhinavumrao8453 5 months ago
For question number 2 of the SQL analysis:
In your inner join with the netflix table, why are you joining on ng.show_id = nc.show_id? Shouldn't it be ng.show_id = n.show_id?
Please clarify my doubt 🙋‍♂️ 🙏.
@abhinavumrao8453 5 months ago
And if it's wrong, how did it still give output for the mapping below?
ng.show_id = nc.show_id
@BhakthiYoutube 5 months ago
Is it an end-to-end data engineering project? It looks like ETL only, right?
@mohammadfurquan241 5 months ago ⁺¹
It's an end-to-end ETL project, which comes under data engineering. Hope you got it.
@rachitkeelpur 5 months ago
Please help me buy this combo course. I want to learn SQL in Hindi and Python in English.
@ankitbansal6 5 months ago
Send an email to sql.namaste@gmail.com
@simplytech4u898 5 months ago
How can I use PostgreSQL here if MS SQL is not available? Any reference video would be helpful.
@ankitbansal6 5 months ago
You can just Google it. It's a simple change.
@simplytech4u898 5 months ago
I have figured out how to import into PostgreSQL. Thanks for the amazing project video.
@gamingfun5309 5 months ago ⁺¹
Sir, how can I connect with MySQL?
@LearnDataSceince 5 months ago
import pandas as pd
from sqlalchemy import create_engine
# Database connection details (placeholders; replace with your own)
username = 'your username'
password = 'your password'
host = 'host'
port = 3306  # MySQL's default port; must be an integer, not a string
database = 'your database name'
# Load the CSV downloaded from Kaggle
df = pd.read_csv('netflix_titles.csv')
# Use the pymysql driver (pip install pymysql); SQLAlchemy opens the connection itself
connection_string = f"mysql+pymysql://{username}:{password}@{host}:{port}/{database}"
engine = create_engine(connection_string)
try:
    # Append the rows to netflix_raw, creating the table if it does not exist
    df.to_sql('netflix_raw', con=engine, index=False, if_exists='append')
    print("DataFrame written to MySQL table 'netflix_raw' successfully.")
except Exception as e:
    print(f"Error: {e}")
@ankitbansal6 5 months ago
Just Google it. It's a simple change.
@adijos92 4 months ago
Please point me to a video on how to create the Kaggle directory on a local machine.
@aminfaisalla 3 months ago
Sir, my titles still remain in a foreign language even after I change the data type to nvarchar. How do I fix this problem? Thank you, sir.
@ankitbansal6 3 months ago ⁺¹
Leave it. Proceed to the next steps.
@aminfaisalla 3 months ago
@ankitbansal6 Alright sir 👍
@JyotiPatra-y6w 5 months ago ⁺¹
Thank you very much sirji.... 🙏🙏🙏
@ankitbansal6 5 months ago ⁺¹
Most welcome
@itsyogijangir 5 months ago
How can we remove a special symbol like the ₹ sign in MS SQL Server? I am not able to do it.
@adilmajeed8439 5 months ago
Use the REPLACE function
@itsyogijangir 5 months ago
@@adilmajeed8439 It's not working for the ₹ sign.
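One possible workaround, assuming the cleanup can happen in Python before the data is loaded (the sample values and the helper name strip_currency are invented for illustration): Python strings are Unicode, so the rupee sign can be matched and removed directly. On the SQL Server side, a common reason for REPLACE appearing to do nothing is a search literal written without the N prefix, though that is only a guess about this particular case.

```python
RUPEE = "\u20b9"  # the ₹ character (INDIAN RUPEE SIGN)

def strip_currency(text):
    """Remove the rupee sign and any surrounding whitespace."""
    return text.replace(RUPEE, "").strip()

# Made-up sample values
prices = ["₹1,200", "₹ 450", "99"]
cleaned = [strip_currency(p) for p in prices]
print(cleaned)
```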
@niravshah5038 5 months ago
Even after setting the data type to nvarchar, I cannot see any characters other than English in my database.
@Ayesha-wf3sy 2 months ago ⁺¹
I also faced the same issue. This is how I resolved it:
When connecting to the database, write these lines:
from sqlalchemy import create_engine, NVARCHAR
engine = create_engine(.....)
# Tell pandas to create/write the title column as NVARCHAR so Unicode text survives
dtype = {'title': NVARCHAR(length=1000)}
conn = engine.connect()
df.to_sql('netflix', con=conn, index=False, if_exists='append', dtype=dtype)
conn.close()
@RohitAnnasahebRagde 2 months ago
@@Ayesha-wf3sy This works. Thanks a lot, I was stuck here for about an hour! Kudos!
@akshaysingh7962 a month ago
@@Ayesha-wf3sy This was really helpful. Thanks
@VaibhaviSuresh-bw8hq 5 months ago
Hi @ankitbansal6, thanks for making this video, it's really helpful and informative. I am trying to implement the same but am encountering one small issue: the special characters are not stored correctly even after changing the table definition to nvarchar, and I am still getting the value as '????'. Can anyone help me with this? I have also tried loading the data with encoding='utf-8' in my PySpark script.
@piyushsharma8294 5 months ago
There seems to be a problem with collation. Along with using nvarchar, we need to change the collation of the database as well.
You can fix that by running this code:
ALTER DATABASE [Database_Name] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
ALTER DATABASE [Database_Name] COLLATE Latin1_General_100_CS_AS_KS_WS_SC_UTF8;
GO
ALTER DATABASE [Database_Name] SET MULTI_USER;
GO
Just replace [Database_Name] in the code above with your database name and it should work fine!
[Edit: there is a slight change in the collation name]
@vaibhavisuresh04 5 months ago
Okay, thank you! 😊 I will try
@tanyachugh1640 5 months ago
@@vaibhavisuresh04 Hi, could you please let me know whether the issue got resolved?
@sukhwinder101 5 months ago
Brother, your SQL is really great
@ladiashrith5230 5 months ago
I am still getting question marks for the title even though it is nvarchar. How can I resolve it? 😒
@ankithbansal6
@lal538 a month ago
Same for me, did it get resolved?
