AmpCode
Apache Spark Streaming DStream and Window Operations | Data Engineer Full Course | Lecture 22
Welcome to the twenty-second lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive deeper into Spark Streaming, focusing on DStreams (Discretized Streams) and windowed operations. These concepts are fundamental for performing advanced real-time data processing tasks.
🔍 What You'll Learn:
Introduction to DStreams and their role in Spark Streaming
Performing transformations on DStreams
Understanding windowed operations and their use cases
Real-world examples of applying windowed operations to streaming data
By the end of this lecture, you’ll be equipped with the knowledge to handle advanced streaming scenarios using DStreams and windowing techniques in Spark Streaming.
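To make the idea of a windowed operation concrete, here is a minimal plain-Python sketch of what a sliding-window count computes over a sequence of micro-batches. It is not Spark code and does not come from the lecture: the function name `windowed_counts` and the sample data are hypothetical, and the window length and slide interval are expressed as numbers of batches rather than as durations.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, List


def windowed_counts(
    batches: Iterable[List[str]],
    window_length: int,
    slide_interval: int,
) -> Iterator[Counter]:
    """Yield word counts over the last `window_length` batches,
    emitting one result every `slide_interval` batches."""
    # A bounded deque keeps only the most recent `window_length` batches.
    window: Deque[List[str]] = deque(maxlen=window_length)
    for i, batch in enumerate(batches, start=1):
        window.append(batch)  # the oldest batch falls out automatically
        if i % slide_interval == 0:
            counts: Counter = Counter()
            for b in window:
                counts.update(b)
            yield counts


if __name__ == "__main__":
    # Four micro-batches of words (illustrative data only).
    stream = [["a", "b"], ["a"], ["c", "a"], ["b"]]
    for result in windowed_counts(stream, window_length=3, slide_interval=2):
        print(dict(result))
```

With a window of three batches and a slide of two, a result is emitted after the second and fourth batches, and the second result no longer includes the first batch because it has left the window.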
🔔 Don’t forget to subscribe to AmpCode for more lectures and updates. If you find this video helpful, please like and share it with others interested in data engineering. Let’s continue mastering real-time data processing together!
---------------------------------------------------------------------------------------------------------
Installation links:
Oracle VM Virtualbox: download.virtualbox.org/virtualbox/6.1.32/VirtualBox-6.1.32-149290-Win.exe
HDP Sandbox link (step-by-step procedure): hackmd.io/@firasj/BkSQJQ8eh
HDP Sandbox installation guide: hortonworks.com/tutorial/sandbox-deployment-and-install-guide/section/1/
-------------------------------------------------------------------------------------------------------------
Also check out our full Apache Hadoop course:
th-cam.com/play/PL6UwySlcwEYJ2hFuGIvr4VEHUAfl-GCNT.html
----------------------------------------------------------------------------------------------------------------------
Apache Spark Installation links:
1. Download JDK: www.oracle.com/in/java/technologies/downloads/#jdk19-windows
2. Download Python: www.python.org/downloads/
3. Download Spark: spark.apache.org/downloads.html
-------------------------------------------------------------------------------------------------------------
Also check out similar informative videos in the field of cloud computing:
What is Big Data: th-cam.com/video/-BoykjY5nKg/w-d-xo.html
How Cloud Computing changed the world: th-cam.com/video/lf2lQAyW2b4/w-d-xo.html
What is Cloud? th-cam.com/video/DeCMeA9Xm2g/w-d-xo.html
Top 10 facts about Cloud Computing that will blow your mind! th-cam.com/video/hmxNJEQ4XVY/w-d-xo.html
Audience
This tutorial is for professionals and students who want to gain in-depth knowledge of Big Data analytics with Apache Spark and move into Spark Developer or Data Engineer roles. It is also useful for analytics professionals and ETL developers.
Prerequisites
Before proceeding with this full course, it is good to have prior exposure to Python programming, database concepts, and any of the Linux operating system flavors.
-----------------------------------------------------------------------------------------------------------------------
Check out our full course topic wise playlist on some of the most popular technologies:
SQL Full Course Playlist-
th-cam.com/play/PL6UwySlcwEYISVLQlYi3W6rGCIo9sJM0J.html
PYTHON Full Course Playlist-
th-cam.com/play/PL6UwySlcwEYJgM4eUQOvR1KAWryFYcclq.html
Data Warehouse Playlist-
th-cam.com/play/PL6UwySlcwEYKxi-fQHLkVYDZrJcBawZA9.html
Unix Shell Scripting Full Course Playlist-
th-cam.com/play/PL6UwySlcwEYIZGsbXnUxsojD0yeUA67lb.html
-----------------------------------------------------------------------------------------------------------------------
Don't forget to like and follow us on our social media accounts:
Facebook-
ampcode
Instagram-
ampcode_tutorials
Twitter-
ampcodetutorial
Tumblr-
ampcode.tumblr.com
-----------------------------------------------------------------------------------------------------------------------
Channel Description-
AmpCode is an e-learning platform with a mission of making education accessible to every student. AmpCode provides tutorials and full courses on some of the most widely used technologies today. By subscribing to this channel, you will never miss out on high-quality videos on trending topics in the areas of Big Data & Hadoop, DevOps, Machine Learning, Artificial Intelligence, Angular, Data Science, Apache Spark, Python, Selenium, Tableau, AWS, Digital Marketing, and many more.
#pyspark #bigdata #datascience #dataanalytics #datascientist #spark #dataengineering #apachespark
Views: 53

Videos

Real Time Data Processing using Spark Streaming Made EASY! | Data Engineer Full Course | Lecture 21
113 views, 1 day ago
Welcome to the twenty-first lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore Spark Streaming, a powerful tool for real-time data processing. This lecture simplifies the concepts of streaming and demonstrates how Spark processes live data efficiently. 🔍 What You'll Learn: What is Spark Streaming and its key features Setting up Spark Streaming for real...
Working with DataFrame and Spark SQL | Data Engineer Full Course | Lecture 20
212 views, 14 days ago
Welcome to the twentieth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll combine the power of DataFrames and Spark SQL to efficiently handle and analyze structured data. This session is designed to enhance your understanding of Spark's versatile data processing capabilities. 🔍 What You'll Learn: How to integrate DataFrames with Spark SQL Writing SQL queries...
Spark Optimization Techniques | Data Engineer Full Course | Lecture 19
168 views, 21 days ago
Welcome to the nineteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive into optimization techniques in Apache Spark and PySpark, helping you enhance the performance of your big data processing tasks. Optimization is key to efficiently managing resources and processing large datasets. 🔍 What You'll Learn: The importance of optimization in Spark and P...
Apache Spark Basic DataFrame Operation | Data Engineer Full Course | Lecture 18
141 views, 28 days ago
Welcome to the eighteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll focus on basic operations with Spark DataFrames. Understanding these operations is critical for manipulating and analyzing structured data effectively in Spark. 🔍 What You'll Learn: How to create DataFrames from different data sources Performing basic operations like select, filter, a...
Working with Spark SQL | Data Engineer Full Course | Lecture 17
258 views, a month ago
Welcome to the seventeenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore Spark SQL, a module in Apache Spark that enables querying structured data using SQL syntax. Spark SQL is a must-know tool for data engineers working with structured data at scale. 🔍 What You'll Learn: Introduction to Spark SQL and its advantages Creating and querying DataFram...
Introduction to Spark DataFrame | Data Engineer Full Course | Lecture 16
166 views, a month ago
Welcome to the sixteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll introduce Spark DataFrames, a powerful data abstraction in Spark that simplifies big data processing. DataFrames are essential for handling structured data, and mastering them is crucial for efficient data engineering. 🔍 What You'll Learn: What Spark DataFrames are and how they differ ...
Writing and Running Spark Application in Python | Data Engineer Full Course | Lecture 15
169 views, a month ago
Welcome to the fifteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll cover how to write and run a Spark application in Python using PySpark. This hands-on session will help you understand the basics of Spark application development and get you started with writing your own Spark jobs. 🔍 What You'll Learn: Setting up PySpark for Spark application develop...
Apache Spark Transformations and Actions | Data Engineer Full Course | Lecture 14
147 views, a month ago
Welcome to the fourteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive into transformations and actions in Apache Spark, the two main types of operations on RDDs that are key to processing data. Knowing how to use transformations and actions will allow you to build powerful data processing pipelines in Spark. 🔍 What You'll Learn: The difference betw...
Apache Spark RDD Explained | Data Engineer Full Course | Lecture 13
204 views, a month ago
Welcome to the thirteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore the concept of RDDs (Resilient Distributed Datasets) in Apache Spark. RDDs are the core abstraction in Spark, and understanding them is essential for effective big data processing. 🔍 What You'll Learn: What RDDs are and their role in Apache Spark Key properties of RDDs: immuta...
Apache Spark Architecture | Data Engineer Full Course | Lecture 12
282 views, a month ago
Welcome to the twelfth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive deep into the architecture of Apache Spark, which is key to understanding how Spark achieves its speed and scalability. Knowing Spark’s architecture will help you make the most of this powerful data processing engine. 🔍 What You'll Learn: Overview of Apache Spark's architecture The ...
Introduction to Apache Spark | Data Engineer Full Course | Lecture 11
239 views, a month ago
Welcome to the eleventh lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll introduce you to Apache Spark, a powerful open-source engine for big data processing. Spark’s speed and versatility make it a must-have tool for modern data engineers, and this lecture will lay the foundation for mastering Spark. 🔍 What You'll Learn: What is Apache Spark and how it diff...
Working with HDFS and running a MapReduce Job | Data Engineer Full Course | Lecture 10
318 views, 3 months ago
Welcome to the tenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll combine our knowledge of Hadoop HDFS and MapReduce to run a MapReduce job on the Hadoop Distributed File System. This practical session will demonstrate how to work with data in HDFS and process it using MapReduce. 🔍 What You'll Learn: How to upload and manage data in HDFS Steps to configu...
Building a simple MapReduce Job | Data Engineer Full Course | Lecture 9
192 views, 3 months ago
Welcome to the ninth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll take a hands-on approach to building a simple MapReduce job in Hadoop. This practical session will help you understand how MapReduce works in action and give you the foundation to build more complex data processing tasks. 🔍 What You'll Learn: Setting up the environment for building a MapRe...
Introduction to YARN in Hadoop | Data Engineer Full Course | Lecture 8
237 views, 3 months ago
Welcome to the eighth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we will explore YARN (Yet Another Resource Negotiator), a fundamental component of Hadoop that manages resources in a distributed environment. Understanding YARN is essential for optimizing the performance and scalability of Hadoop clusters. 🔍 What You'll Learn: What is YARN and its role in the...
What is MapReduce in Hadoop | Data Engineer Full Course | Lecture 7
198 views, 3 months ago
Understanding Hadoop HDFS | Data Engineer Full Course | Lecture 6
233 views, 3 months ago
Install Hadoop on Windows | Data Engineer Full Course | Lecture 5
1K views, 3 months ago
Install Apache Spark PySpark on Windows | Data Engineer Full Course | Lecture 4
1.3K views, 3 months ago
Use Cases and Scenarios for Hadoop and Spark | Data Engineer Full Course | Lecture 3
231 views, 3 months ago
Introduction to Hadoop and Spark | Data Engineer Full Course | Lecture 2
389 views, 3 months ago
Overview of Data Engineering | Data Engineer Full Course | Lecture 1
513 views, 3 months ago
Neo4j Cypher Aggregating Functions | Neo4j Tutorial | Lecture 12
563 views, 5 months ago
Neo4j Cypher Scalar Functions | Neo4j Tutorial | Lecture 11
478 views, 5 months ago
Neo4j Cypher Predicate Functions | Neo4j Tutorial | Lecture 10
613 views, 5 months ago
Neo4j Cypher Values and Data Types | Neo4j Tutorial | Lecture 9
602 views, 5 months ago
Neo4j Cypher Patterns | Neo4j Tutorial | Lecture 8
870 views, 6 months ago
Neo4j Cypher Subqueries | Neo4j Tutorial | Lecture 7
1.4K views, 6 months ago
Neo4j Cypher Clauses | Neo4j Tutorial | Lecture 6
4.2K views, 9 months ago
Real-time vs Batch Data Processing
676 views, 10 months ago

Comments

  • @djjames-u1b, 1 day ago

    indeed video is great but while explaining try to explain by considering the points you have put on your screen otherwise its bit confusing to know which point you are talking about......

  • @FEYSALAL-RAHMANMOUCKEYTOU, 2 days ago

    operation_manager don't appear or simply [PATH_NOT_FOUND] Path does not exist

  • @Ohisthisyou, 2 days ago

    can someone help , i have downloaded hadoop 3.3 which is the newest version but it is not showing in github . what to do ?

  • @pratiksingh5022, 3 days ago

    After clicking on launch neo4j the application is not opening in my system

  • @strawberrycy, 5 days ago

    this is a lifesaver!

  • @plutonium4574, 8 days ago

    Nonsense... I didn't understand the architecture at all

  • @Leerosasi, 8 days ago

    I am getting this error "WARN NativeCodeLoader: Unable to load native-hadoop library for your platform" I could get the hadoop 2.7, instead I used hadoop 3.3 and its respective winutils. Advise.

  • @ranveersankpal5156, 9 days ago

    great 🤩

  • @karthikgandi1677, 9 days ago

    Note - It's not ---partition, but ---partitions

  • @naren06938, 9 days ago

    This type all services at single platform Nice, but any alternative for online for practice like cloud type, but for free?

  • @manikantsharma3496, 10 days ago

    The whole video was spent just showing your face

  • @jaykadu6835, 11 days ago

    I want to run a workflow engine project on airflow using docker how can i do that, Do i have to do additional steps. If yes, can you provide it.

  • @ravi-y7b1d, 11 days ago

    facing the error while saving the file

  • @ravi-y7b1d, 13 days ago

    i did everything exact same but for the later versions of spark 3.x java 8 or 11 is required so just download the java 8 or 11 version and it worked for me.

  • @parvadhami980, 14 days ago

    To those who are getting error in CMD:Use "./spark-shell" Instead of just spark-shell in CMD

  • @ArunKumar-wi2et, 14 days ago

    Bro in jupyter throwing error

  • @prasadbarla7215, 15 days ago

    spark runs only on java 8 or 11 version it doesn't work with latest version I've tried it

  • @Jayamathi-y1b, 15 days ago

    One of the best . Thank you very much

  • @dhananjayapattnaik7428, 15 days ago

    last two days onward i am struggling to install it..please help me

  • @dhananjayapattnaik7428, 15 days ago

    request returned Internal Server Error for API route and version %2F%2F.%2Fpipe%2FdockerDesktopLinuxEngine/v1.47/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Ddairflow%22%3Atrue%7D%7D, check if the server supports the requested API version i am getting this issue

  • @gangatharan-x4r, 16 days ago

    very nice explained

  • @puravraj3517, 17 days ago

    I am unable to install python and unable to install the epel https

  • @praveenguduru160, 17 days ago

    Great content

  • @MunniV-c1f, 18 days ago

    PS C:\Users\dataeng> docker-compose up -d when i am using this command it is showing errors like---- no configuration file provided: not found

  • @Munninarendra, 19 days ago

    sir is it possible to install airflow without docker

  • @AbhiShek-m6s, 22 days ago

    I did everything until the environment variables setup, still while using cmd spark-shell it is giving me "'spark-shell' is not recognized as an internal or external command, operable program or batch file." versions I used - For Java: java version "11.0.24" 2024-07-16 LTS Java(TM) SE Runtime Environment 18.9 (build 11.0.24+7-LTS-271) Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.24+7-LTS-271, mixed mode) For Python: Python 3.11.0rc2 For Spark: spark-3.5.3-bin-hadoop3 For Hadoop: (file from below location) winutils/hadoop-3.3.6/bin /winutils.exe

  • @SupravaMishra-e4d, 22 days ago

    Spark-Shell is not running

  • @SupravaMishra-e4d, 22 days ago

    I am getting errors continuously after doing the same procedure as well, please reply to me.

  • @vasutke1187, 24 days ago

    High clarity and useful. Thanks Sir

  • @geetakavalad8983, 24 days ago

    I have followed all the steps and added all the system variables but at that time winutils file was not present in my system

    • @geetakavalad8983, 24 days ago

      Now I have that file how to make the changes plz let me know

  • @k-universe0022, 24 days ago

    plz teach more about programs like how to implement these in scenario based programs

  • @k-universe0022, 24 days ago

    your method of teaching is so good 👍

  • @rithvikramdas323, 26 days ago

    Getting PY4javaerror, i have followed all the installation steps

  • @aniketrele7688, 28 days ago

    Do you have idea how to decrypt client side data of Mongo using spark?

  • @sigmaprideu, 28 days ago

    Very good explanation ❤❤

  • @ameenullahsyed8526, 28 days ago

    can I get your email address, wanted to get in touch with you

  • @udaykumar-tb5kn, a month ago

    How to open Linux terminal?? Do u use Amazon Linux or how u able to enter all these

  • @Ahmmmm-y2b, a month ago

    Sir i need java script

  • @Codeyug, a month ago

    Great brother..Thanks from codeyug

  • @tejaschaudhari6424, a month ago

    but what if we have installed apache spark manually? I have done this so when I am trying to execute my pyspark script in spyder it's saying no module name pyspark.

  • @naren06938, a month ago

    Please try to make videos bit interesting Even bore theory also...you are reading PDFs continuously....iam getting sleepy

  • @rahmaesam2732, a month ago

    still hadoop not recognize even with your installation it give you a warning message " unable to load native.hadoop library"

  • @donjuancapistrano2382, a month ago

    The best video on installing payspark, even in 2024. Many thanks to the author!

    • @playtrip7528, a month ago

      which spark version did u downloaded ?

    • @donjuancapistrano2382, a month ago

      @playtrip7528 I downloaded 3.5.3 and pre build for Hadoop 3.3 with 3.0.0 winutils

    • @donjuancapistrano2382, a month ago

      @playtrip7528 I downloaded a 3.5.3 version of pyspark and 3.3 pre built for Hadoop with 3.0.0 winutils

  • @anuraggupta5665, a month ago

    Hi @AmpCode Thanks for the great tutorial. I followed each steps and spark is working fine. But when I'm executing some of my pyspark script, I'm getting below Hadoop error: ERROR SparkContext: Error initializing SparkContext. java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. Can you please help me on this urgently.. I have set all paths as you showed in video but I'm not able to solve this error. Please Help.

  • @MOHITTHAKKAR-v6d, a month ago

    Speak in Hindi

  • @fristname-of6ko, a month ago

    Tq so much must needed❤❤❤

  • @ronaldpai, a month ago

    "It's a fairly small file, only 573 MB" 😂

  • @gbs7212, a month ago

    thank you so much, very helpful! The only error I got was running spark-shell, but from other comments I figured out that you can either run the command prompt as admin or cd into the spark folder and then call it

  • @parameshd6130, a month ago

    Hi sir, Is it possible to update the partial object in Mongo DB using spark

  • @nirajkarki6003, a month ago

    Very nice video can you create more of it