Data Engineering Studies
Joined 2 Oct 2011
Welcome to the 'Data Engineering Studies' channel, dedicated to helping aspiring professionals prepare for roles such as Data Engineer, Data Scientist, Data Analyst, and other data-centric positions.
For those eager to expand their knowledge, I invite you to share your suggestions for new video topics. Simply reach out to me at faitus.jeline@gmail.com.
Your input ensures that I continue to deliver content that meets your needs and helps you advance in your data engineering journey.
Looking forward to hearing from you!
Thanks and regards,
Faitus Jeline Joseph
Apache Spark Deployment Modes
Spark runs in several modes, ranging from a single machine to a large-scale cluster of machines.
Spark offers three primary deployment modes:
Client mode
Cluster mode
Local mode
#apachespark #dataengineering #pyspark
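These modes differ mainly in where the driver process runs. As a rough illustration (the YARN master and the my_job.py script name are placeholders of mine, not from the video), typical spark-submit invocations per mode look like:

```shell
# Local mode: driver and executors run inside a single JVM on this
# machine; local[*] uses all available cores.
spark-submit --master "local[*]" my_job.py

# Client mode: the driver runs on the machine that submits the job,
# while executors run on the cluster (here, a YARN cluster).
spark-submit --master yarn --deploy-mode client my_job.py

# Cluster mode: the driver also runs inside the cluster, on a worker
# node, so the job survives the submitting machine disconnecting.
spark-submit --master yarn --deploy-mode cluster my_job.py
```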
Views: 31
Videos
Apache Spark Architecture
97 views · 14 days ago
Apache Spark Architecture #spark #pyspark #bigdata #dataengineering
Leetcode 62 - Unique Paths - Python Solution
38 views · 28 days ago
In this video, I solve Leetcode 62 - Unique Paths using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/62_Unique_Paths.py #leetcode #dataengineering #python
Leetcode 647 - Palindromic Substrings - Python Solution
71 views · a month ago
In this video, I solve Leetcode 647 - Palindromic Substrings using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/647_PalindromicSubstings.py
Leetcode 670 - Maximum Swap - Python Solution
76 views · a month ago
In this video, I solve Leetcode 670 - Maximum Swap using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/670MaximumSwap.py
Leetcode 243 - Shortest Word Distance - Python Solution
136 views · a month ago
In this video, I solve Leetcode 243 - Shortest Word Distance using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/243ShortestWordDistance.py
Leetcode 165 - Compare Version Numbers - Python Solution
46 views · a month ago
In this video, I solve Leetcode 165 - Compare Version Numbers using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/165_CompareVersionNumbers.py
Leetcode 844 Backspace String Compare - Python Solution
81 views · 2 months ago
In this video, I solve Leetcode 844 - Backspace String Compare using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/844BackspaceStringCompare.py
Leetcode 150 Evaluate Reverse Polish Notation - Python Solution
45 views · 2 months ago
In this video, I solve Leetcode 150 - Evaluate Reverse Polish Notation using Python. Refer to the solution at github.com/faitusjelinej/Algorithms/blob/main/150_Evaluate_RPN.py
Join Strategies in Apache Spark
175 views · 3 months ago
In this video, you will learn about the different join strategies in Apache Spark. Apache Spark has the following five algorithms to choose from:
1. Broadcast Hash Join
2. Shuffle Hash Join
3. Shuffle Sort Merge Join
4. Broadcast Nested Loop Join
5. Cartesian Product Join
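To build intuition for the first of these strategies, here is a toy, pure-Python sketch of my own (not Spark's implementation) of what a broadcast hash join does: the small side is materialized into an in-memory hash table, which is then probed while streaming the large side once.

```python
def broadcast_hash_join(large, small, key_large, key_small):
    """Toy model of a broadcast hash join over lists of dicts."""
    # Build phase: hash the (small) broadcast side by its join key.
    table = {}
    for row in small:
        table.setdefault(row[key_small], []).append(row)
    # Probe phase: stream the large side and emit matching row pairs.
    result = []
    for row in large:
        for match in table.get(row[key_large], []):
            result.append({**row, **match})
    return result

orders = [{"order_id": 1, "cust": "a"}, {"order_id": 2, "cust": "b"}]
customers = [{"cust": "a", "name": "Alice"}]
joined = broadcast_hash_join(orders, customers, "cust", "cust")
```

In real Spark, this strategy is picked automatically when one side is below spark.sql.autoBroadcastJoinThreshold, or when you hint it explicitly with broadcast() in PySpark.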
Azure SQL Managed Instance - Introduction
75 views · 4 months ago
Azure SQL Managed Instance - Introduction #azure #bigdata #azuretutorials #dataengineering
Leetcode 2053 Kth Distinct String in an Array - Python Solution
31 views · 4 months ago
In this video, I solve Leetcode 2053 - Kth Distinct String in an Array using Python. #dataengineering #leetcode #leetcodechallenge #leetcodethehardway #python
Leetcode 7 Reverse Integer - Python Solution
114 views · 4 months ago
In this video, I solve Leetcode 7 - Reverse Integer using Python.
Schedule Trigger in Azure Pipelines
79 views · 4 months ago
This video provides detailed information about the schedule trigger and the steps to create, start, and monitor a Schedule Trigger in Azure Pipelines.
Unzip files, dynamically create folders and load files into respective folders using Azure Pipeline
233 views · 4 months ago
Use Case: Unzip the files, dynamically create folders, and load the files into the respective folders using Azure Pipeline
Final Account Balance - SQL Interview Question
44 views · 4 months ago
Final Account Balance - SQL Interview Question
Dynamically ingest data from Azure SQL DB to Storage account using Azure Synapse Pipeline.
134 views · 4 months ago
Dynamically ingest data from Azure SQL DB to Storage account using Azure Synapse Pipeline.
Data Processing with PySpark and SparkSQL
223 views · 5 months ago
Data Processing with PySpark and SparkSQL
Nested ForEach activity in Azure Data Factory
997 views · 5 months ago
Nested ForEach activity in Azure Data Factory
Leetcode 130 Surrounded Regions - Python Solution
62 views · 6 months ago
Leetcode 130 Surrounded Regions - Python Solution
Leetcode 64 Minimum Path Sum - Python Solution
37 views · 6 months ago
Leetcode 64 Minimum Path Sum - Python Solution
Leetcode 228 - Summary Ranges - Python Solution
64 views · 6 months ago
Leetcode 228 - Summary Ranges - Python Solution
Leetcode 205 - Isomorphic Strings - Python Solution
86 views · 6 months ago
Leetcode 205 - Isomorphic Strings - Python Solution
Leetcode 35 - Search Insert Position - Python Solution
39 views · 7 months ago
Leetcode 35 - Search Insert Position - Python Solution
Depth First Search (DFS) - Graph Traversal using Python
38 views · 7 months ago
Depth First Search (DFS) - Graph Traversal using Python
HackerRank - Decorators 2 - Name Directory - Python Solution
278 views · 7 months ago
HackerRank - Decorators 2 - Name Directory - Python Solution
Leetcode 241 - Different Ways to Add Parentheses - Python Solution
371 views · 8 months ago
Leetcode 241 - Different Ways to Add Parentheses - Python Solution
Tweets' Rolling Averages - SQL Interview Question
76 views · 8 months ago
Tweets' Rolling Averages - SQL Interview Question
Create Cosmos DB database, container, items and read items using Python
333 views · 8 months ago
Create Cosmos DB database, container, items and read items using Python
Very simple and clear explanation
Glad it helped.
Very descriptive.
Glad you found it helpful!
Nice
Thanks, glad you liked it!
Is it possible to add the XML extension to a pyspark-jupyter installation, or is this extension only available for Databricks?
spark-xml is not limited to Databricks; you can add it to any Spark installation. Note, however, that it is distributed as a Spark/Maven package rather than a pip package, so a plain pip install will not give Spark the JVM-side reader it needs. Instead, load it when you create the Spark session, for example by setting spark.jars.packages to com.databricks:spark-xml_2.12:<version> (matching the Scala version of your Spark build). After that you can read XML from your PySpark code.
Hi, installing failed due to permission denial. It says permission denied! What am I doing wrong? I'd appreciate it if you can help!
Run docker compose with sudo if you don't have permission: sudo docker compose up
Hi! I can't save a file (the CSV from the example) through dataframe.write into a local Docker folder. How can I deal with it?
I will get back on this shortly.
@@dataenggstudies Thanks, I will be waiting!
Step 1: Mount a volume. When you run your Docker container, mount a volume that acts as a bridge between the container's filesystem and your local filesystem (for example, docker run -v /local/path:/path/in/container). Step 2: Write the CSV file in your code. Inside the container, write to the mounted path. With a Spark DataFrame that would be df.write.csv("/path/in/container/output", header=True); with a pandas DataFrame, df.to_csv("/path/in/container/your_file.csv", index=False). I will create a video on these steps.
the goat
Thanks!
Welcome!
thanks nice explanation
Glad you liked it
Useful
Glad it helped!
This is a very good and easy solution
Thank you. Glad it helped.
Need an end-to-end project demo video.
Glad it helped. Sure, thank you for the suggestion. I will work on an end-to-end project demo. 👍
Thanks for the video! It was so simple and yet I was very confused till I got here.
Glad it helped!
Could you create tables in a dedicated SQL pool for these files dynamically?
Sure I will create a video for this scenario.
This was very helpful in getting my csv files into dataframes!
Glad it helped! Keep learning.
Learned many things from this video... waiting for more videos.
Good. Keep learning! I am glad it helped!
Really good explanation! Thanks for this!
You are welcome! Glad it helped.
Nice explanation. Expecting more videos like this. Could you do a video on how to load both files and SQL data into an Azure folder using a single dedicated pipeline?
Thanks for the idea! Noted. I will upload soon.
Using copy behavior could be one more approach for this scenario.
@@RajashekharKumbar-gj8wz you are correct.
Hello sir... Thank you for the solution. Can you please explain the same solution if the XML file has varying nested data types?
Sure. I will!
@@dataenggstudies Thank you... Also, the nested data types may contain various depth levels, so the flattening should be dynamic logic. Is that possible?
Based on what I have researched, dynamic flattening is not possible. I will share if I find any details.
These videos are so helpful, simple yet so informative
Happy to hear that!
Thanks for the video, bro, but localhost 4040 is not working.
Port 4040 might already be in use by another application on your machine. Try a different port, for example by setting spark.ui.port in your Spark configuration.
@@dataenggstudies I have tried using different ports; it didn't work.
@@varmauppalapati7556 Could you please share the error you are getting?
Thank you so much, bro ❤
Glad it helped you.
I think you misspelled it; it should be the square of a number, not the square root of a number. But this is a very smart approach, thanks for sharing this work.
That is correct, thank you for the correction. I am glad it helped ❤
Nice explanation.
Glad it helped.
Great
Glad it helped!
Great explanation! Thank you
Glad it helped!
Thanks for the explanation... it helped a lot.
Glad it helped!
Best solution thanks💯💯💯
Glad it helped!
Great to see clear instructions and simple approach
Glad it was helpful!
Amazing content , Thanks
Glad you liked it
Well explained, very interesting scenario!
Thank you.
Good Explanation. Expecting more scenarios from you
Thank you. I am glad it helped you!
Very straightforward and clear tutorial. Thank you, Joseph.
Glad it was helpful!
Can you demonstrate how to pgp encrypt a file in azure storage blob, using synapse notebooks where the public key is also in storage blob
Sure, I will, thank you for letting me know.
You can use the unpivot function.
Yes, it looks like unpivot is a new feature in Spark version 3.4.0. Thank you for sharing; when I recorded this video, this functionality was not available.
great content
Thank you. I am glad you liked it.
Very helpful
Thank you. I am glad that it helped.
Good content, very helpful. But please parameterize the target folder and input folder.
Thank you for the suggestion. I will incorporate in the upcoming videos.
Hi bro, I have one scenario: I have documents in Cosmos DB for NoSQL, and I want to create a pipeline that is triggered when a certain value is updated in a Cosmos DB document (for example, age = 21), then performs some transformation using Python and sends the changes to a new Cosmos DB container. If you could make a video on that scenario, it would be a great help.
Sure, I will create a video for this scenario. Thank you for sharing
You're writing a while loop inside a for loop; doesn't that increase the time complexity from n to m * n or something like that?
This solution looks like O(n*m), but it actually is not, because we iterate in the while loop only for the numbers that are 'first' numbers, in other words the numbers that satisfy the condition (if n - 1 not in nums), not for all the numbers. Each number is visited at most twice overall, hence the time complexity is O(n).
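For reference, the idea in this reply can be sketched as the standard set-based solution to this problem (LeetCode 128, Longest Consecutive Sequence; the function and variable names here are my own, not from the video):

```python
def longest_consecutive(nums):
    # O(n): set lookups are O(1), and the inner while loop only runs
    # for numbers that start a sequence (n - 1 not in the set), so
    # each number is visited at most twice across the whole scan.
    num_set = set(nums)
    best = 0
    for n in num_set:
        if n - 1 not in num_set:      # n is the start of a sequence
            length = 1
            while n + length in num_set:
                length += 1
            best = max(best, length)
    return best
```

For example, longest_consecutive([100, 4, 200, 1, 3, 2]) finds the run 1, 2, 3, 4 and returns 4.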
Bhai, thanks for the Python solution. There are not many resources available for DSA with Python. Keep going.
Sure, I will upload more. Glad it helped.
Well done. Thank you.
Thank you.
Good and informative. Keep it up.
Thank you.
Well explained😊
Thank you 🙂