Intro to Data Visualization with R & ggplot2

  • Published on Jul 4, 2024
  • In this webinar, we will provide an introduction to data visualization with the ggplot2 package. The focus of the webinar will be using ggplot2 to analyze your data visually with a specific focus on discovering the underlying signals/patterns of your business.
    The R programming language is experiencing rapid increases in popularity and wide adoption across industries. This popularity is due, in part, to R’s rich and powerful data visualization capabilities. While tools like Excel, Power BI, and Tableau are often the go-to solutions for data visualizations, none of these tools can compete with R in terms of the sheer breadth of, and control over, crafted data visualizations.
    As an example, R’s ggplot2 package provides the R programmer with dozens of print-quality visualizations - where any visualization can be heavily customized with a minimal amount of code.
    In this talk attendees will learn how to:
    • Craft ggplot visualizations, including customization of rendered output.
    • Choose optimal visualizations for the type of data and the nature of the analysis at hand.
    • Leverage ggplot2’s powerful segmentation capabilities to achieve “visual drill-in of data”.
    • Export ggplot2 visualizations from RStudio for use in documents and presentations.
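
    A minimal sketch of the craft-and-export workflow listed above, assuming a small made-up data frame in place of the webinar's Titanic data. `ggsave()` is ggplot2's export function; the file name, size, and resolution below are arbitrary choices, not values from the webinar:

```r
library(ggplot2)

# Made-up stand-in for the webinar's Titanic data (not the real file).
titanic <- data.frame(
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Survived = factor(c(0, 1, 1, 0, 1, 0))
)

# Bar chart of passenger counts by sex, split by survival.
p <- ggplot(titanic, aes(x = Sex, fill = Survived)) +
  theme_bw() +
  geom_bar() +
  labs(y = "Passenger Count", title = "Survival by Sex (made-up data)")

# Write the plot to a PNG sized for a document or slide.
out_file <- file.path(tempdir(), "survival_by_sex.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 300)
```
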
    Repository:
    code.datasciencedojo.com/data...
    Table of Contents:
    0:00 Introduction
    6:19 Titanic dataset
    14:07 ggplot2
    27:35 Data analysis
    32:45 Factor variables
    33:10 Hypothesis data
    46:26 Visualization
    54:04 Age
    56:34 Data visualization
    --
    At Data Science Dojo, we believe data science is for everyone. Our data science trainings have been attended by more than 10,000 employees from over 2,500 companies globally, including many leaders in tech like Microsoft, Google, and Facebook. For more information please visit: hubs.la/Q01Z-13k0
    #datavisualization #rprogramming #ggplot2

Comments • 128

  • @asimmunshi4830
    @asimmunshi4830 4 years ago +5

    This was amazing - thanks. It was, literally, the first time I've ever coded anything in my life, I've wanted to learn about data vis stuff for sports analytics for a long time and this video was the perfect introduction.
    If anyone is in a similar position to myself, and has zero ! previous knowledge of R or programming, and wants to learn about data vis, I'd just start with this video. The only thing I needed to hit google for was to learn how to import the dataset into RStudio (yes, really).
    Thank you!

  • @kushagramishra9257
    @kushagramishra9257 6 years ago +55

    I would suggest that everyone beginning with ggplot2 go through this 1-hour video; it will save you a lot of time in understanding the basics.

    • @Datasciencedojo
      @Datasciencedojo 6 years ago +3

      @Kushagra Mishra - You are too kind, glad you liked the video!
      Dave

    • @deepakhemadri
      @deepakhemadri 6 years ago

      Kushagra Mishra uwl

  • @AliAwadh980
    @AliAwadh980 6 years ago +25

    The seventh question, I believe the labs should be as:
    labs(x = "Age", y = "Density"), and
    labs(x = "Age", y = "Survived Count")
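
    A sketch of the two plots with the corrected labels suggested above, assuming a made-up data frame with `Age` and `Survived` columns in place of the tutorial's Titanic file. The plot titles are placeholders:

```r
library(ggplot2)

# Made-up stand-in for the tutorial's Titanic data (not the real file).
set.seed(1)
titanic <- data.frame(
  Age      = round(runif(200, min = 1, max = 80)),
  Survived = factor(sample(c(0, 1), 200, replace = TRUE))
)

# Density plot: Age on the x axis, estimated density on the y axis.
p_density <- ggplot(titanic, aes(x = Age, fill = Survived)) +
  theme_bw() +
  geom_density(alpha = 0.5) +
  labs(x = "Age", y = "Density", title = "Age Density by Survival")

# Histogram: Age on the x axis, passenger count per 5-year bin on the y axis.
p_hist <- ggplot(titanic, aes(x = Age, fill = Survived)) +
  theme_bw() +
  geom_histogram(binwidth = 5) +
  labs(x = "Age", y = "Survived Count", title = "Age Histogram by Survival")

print(p_density)
print(p_hist)
```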

  • @Jenna-iu2lx
    @Jenna-iu2lx 2 years ago +5

    This video is literally PERFECT for ggplot2 beginners! In only one hour, you'll learn the basics of ggplot2 R coding and you'll end up falling in love with ggplot2 (I thought this language was weird and not intuitive at first, but after this video I think it's very useful and practical to visualize more accurate data plots!)

  • @maurocarvalho9835
    @maurocarvalho9835 4 years ago +1

    Great presentation! Thanks for making ggplot2 easier to be understood.

  • @AncillenoDavis
    @AncillenoDavis 6 years ago

    Definitely one of the best intros to ggplot2

  • @shahronak47
    @shahronak47 6 years ago +184

    26:30 - Actual video

  • @thedeadman8361
    @thedeadman8361 6 years ago

    Great intro to ggplot2. Made the basics very clear.

  • @ayasuniverse
    @ayasuniverse 3 years ago +1

    THIS IS ABSOLUTELY ONE OF THE BEST TUTORIALS ON CODING THAT I'VE EVER SEEN !!!! THANK YOUUU !! UP UP UP

  • @najibbht9727
    @najibbht9727 4 years ago +1

    Super helpful and crystal clear intro. Thank you very much!

  • @joanabucheri1380
    @joanabucheri1380 4 years ago +1

    Thank you for this video, it was indeed helpful. I didn't have sufficient knowledge of ggplot, but now I do. Thanks a lot!

  • @charleschungudaka6879
    @charleschungudaka6879 3 years ago

    Wow, the Best video on ggplot2. Love you Data Science Dojo. So very much helpful and really got me excited.

  • @examtest2883
    @examtest2883 6 years ago

    worth watching 1 hr..Really helpful. Thanks a lot

  • @martinwilson1729
    @martinwilson1729 6 years ago +1

    Very helpful and appreciated, thanks for uploading

  • @andymonks2808
    @andymonks2808 6 years ago

    excellent video! Thank you very much Dave

  • @shreyachauhan9181
    @shreyachauhan9181 3 years ago

    oh man I can't thank enough, you are so good I lost my mind in understanding u hold my back, thanks

  • @levibauer375
    @levibauer375 4 years ago

    Found this super helpful! Thanks so much

  • @mohamedbelo5462
    @mohamedbelo5462 6 years ago

    Wonderful, this was so useful: one hour full of knowledge and hands-on practice.
    Thanks a lot, guys!
    Belo

    • @lcaslokonon3979
      @lcaslokonon3979 5 years ago

      Thank you for making this great tutorial! It's easy and simple to follow! I've learned a ton from it; keep making more, please!

  • @jiyaadnaeem4802
    @jiyaadnaeem4802 4 years ago

    Great tutorial, thank you!

  • @mohdyounisbhat2447
    @mohdyounisbhat2447 3 years ago

    Such a wonderful video!!! Such a simple and easy way to make it understandable.

  • @Jxdemelo1961
    @Jxdemelo1961 3 years ago +1

    Simply exceptional. Thank you. I'm hooked. And I'm not even a Data Scientist. If I was 20 years younger, I'd get into this field.

    • @Datasciencedojo
      @Datasciencedojo 3 years ago

      Glad to hear that, Jaime. It's still not too late to start with data science. Take a look at our Bootcamp, which might be a great way for you to start: datasciencedojo.com/data-science-bootcamp/

  • @asifkhan-fr4wb
    @asifkhan-fr4wb 6 years ago +2

    very nice explanation with the dataset. Thank You.

  • @jiangxu3895
    @jiangxu3895 4 years ago

    Thank you very much for your explanation.

  • @mursyidhb
    @mursyidhb 3 years ago +3

    Thanks Dave... I think the way you present the code and interpret the results is awesome. Even though I am new to ggplot2, the presentation made me feel as if I were already familiar with the code. You make R seem not that difficult.

    • @Datasciencedojo
      @Datasciencedojo 2 years ago +1

      glad to help you out, keep following us for more content!

  • @ntimdomfeh1959
    @ntimdomfeh1959 6 years ago

    Thank you very much. You are far too kind

  • @techknowhow4802
    @techknowhow4802 6 years ago

    Good lecture on ggplot and its functionalities. I liked the examples. I would have liked to see it go a little deeper into examples coders and analysts can use directly in their analysis and data science problems. Thank you.

  • @rahulfederer20
    @rahulfederer20 1 year ago

    Legit the perfect video for a beginner. Thanks a ton man

    • @Datasciencedojo
      @Datasciencedojo 1 year ago

      Keep following us for more crash courses!

  • @khoadang4844
    @khoadang4844 5 years ago +3

    Currently a college student pursuing a degree in Economics. I'm taking Intro to Economic Data Analysis, and we have the choice of using R or Excel. Our first Homework Project directly coincides with using ggplot2. I haven't even finished half of the video but can already say I have learned so much about R. Sweet vid!

  • @sahalnaz3904
    @sahalnaz3904 6 years ago +1

    very useful video... thank you

  • @tassoskat8623
    @tassoskat8623 3 years ago +1

    I just started loving R

  • @constantineau9686
    @constantineau9686 3 years ago +1

    An excellent video! Thanks a lot!

  • @AYODEJIIYANDA
    @AYODEJIIYANDA 6 years ago

    This is a great tutorial, good job

  • @ruizhang3956
    @ruizhang3956 4 years ago +2

    Thanks for the tutorial! Small caveats on the density plot and the histogram towards the end. The axes are mislabeled. Y should be probability density or counts, while X should be age

  • @eyadha1
    @eyadha1 3 years ago

    thank you very much, very helpful for me.

  • @pewolo
    @pewolo 2 years ago +1

    Clear and pertinent!

  • @pamelabenitez7077
    @pamelabenitez7077 6 years ago

    Nice video to get you hooked on ggplot2

  • @gyandeepsharma5516
    @gyandeepsharma5516 4 years ago

    Thanks for such a good video. Loved it.

  • @RamadanElsharif
    @RamadanElsharif 3 years ago +1

    So clear and nice lecture. Thank you so much.

    • @Datasciencedojo
      @Datasciencedojo 3 years ago +1

      Glad you liked it, stay tuned for more lectures!

  • @CharwakApte
    @CharwakApte 6 years ago

    Infinite SNR - Thanks!

  • @victorsamuel6199
    @victorsamuel6199 2 years ago

    This is so awesome. Thank you so much.

  • @aegystierone8505
    @aegystierone8505 4 years ago

    Incredible, telling a story with data!

  • @WahranRai
    @WahranRai 6 years ago +1

    46:08 Maybe instead of using copy and paste, we could use, for example:
    ggplt = ggplot(titanic, aes...) and add layers to that
    ggplt +
    theme_bw() +
    labs()....
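
    A runnable sketch of this reuse pattern, assuming a made-up data frame with `Sex` and `Survived` columns. The variable name `base_plot` and the titles are illustrative only:

```r
library(ggplot2)

# Made-up stand-in for the tutorial's Titanic data (not the real file).
titanic <- data.frame(
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Survived = factor(c(0, 1, 1, 0, 1, 1))
)

# Store the shared data and aesthetic mapping once.
base_plot <- ggplot(titanic, aes(x = Sex, fill = Survived)) +
  theme_bw()

# Reuse the stored object and add different layers to it.
stacked_bars <- base_plot +
  geom_bar() +
  labs(y = "Passenger Count", title = "Counts by Sex")

share_bars <- base_plot +
  geom_bar(position = "fill") +
  labs(y = "Share of Passengers", title = "Shares by Sex")

print(stacked_bars)
print(share_bars)
```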

  • @darren46166
    @darren46166 5 years ago +4

    Mistakes in your code 162, 163, 171 and 172. The x axis should be "Age" and y axis should be "Survived". By the way, great tutorial!

  • @afifkhaja
    @afifkhaja 5 years ago +1

    Superb presentation

  • @Justme1635438
    @Justme1635438 4 years ago

    I loaded the dataset in both SPSS and R and did all of the plots - to me SPSS was more easy to use, but the plots actually look better in R. Great video.

  • @tuanlong9238
    @tuanlong9238 6 years ago

    thanks a lot !!!

  • @nkristianschmidt
    @nkristianschmidt 6 years ago +1

    Very helpful. I think at the end, the density plot vs histograms issue is, the layered density plots show two different distributions of age and the histograms show one distribution of age and bi-color that distribution by survival. Two different things.

  • @poojadhonde3169
    @poojadhonde3169 5 years ago

    Very nice, sir.
    Please make a video on 3D visualization.

  • @kylelarson5074
    @kylelarson5074 6 years ago

    What would be fantastic is if you could please create 10-15min or less summary videos of your lessons just to provide a snap shot of the different codes. That way it would make it extremely easy to revise your information without needing to sit through the repetition of the more indepth explanations we have already heard.

  • @aikagyan999
    @aikagyan999 3 years ago

    Thank you so much..

  • @dreznik
    @dreznik 6 years ago

    you should do geom_boxplot(notch=T) so folks understand the concept of visually comparing medians; also read_csv preferred over read.csv
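
    A sketch of the notched box plot suggested above, assuming made-up `Age` and `Survived` columns. In ggplot2, `notch = TRUE` draws a notch around each median, and notches that do not overlap suggest the medians differ. (`read_csv` comes from the separate readr package and is not used here.)

```r
library(ggplot2)

# Made-up stand-in for the tutorial's Titanic data (not the real file).
set.seed(5)
titanic <- data.frame(
  Survived = factor(rep(c(0, 1), each = 100)),
  Age      = c(rnorm(100, mean = 32, sd = 12), rnorm(100, mean = 27, sd = 12))
)

# Notched box plots make the two medians easier to compare by eye.
p <- ggplot(titanic, aes(x = Survived, y = Age)) +
  theme_bw() +
  geom_boxplot(notch = TRUE) +
  labs(x = "Survived", y = "Age", title = "Age by Survival (made-up data)")

print(p)
```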

  • @akakhbod
    @akakhbod 4 years ago

    Thanks a ton david ...

  • @shubhangisuralkar8436
    @shubhangisuralkar8436 6 years ago +2

    really helpful

  • @statisticstime4734
    @statisticstime4734 3 years ago

    Excellent!

  • @This_aint_rocket_science
    @This_aint_rocket_science 4 years ago +1

    thanks very informative

  • @JulioCCavalcanti
    @JulioCCavalcanti 3 years ago

    Very good!

  • @RavinderRam
    @RavinderRam 6 years ago +1

    ggplot2: best package in data science for visualization

  • @aks1008
    @aks1008 5 years ago

    Sir very good video...I just had a doubt ...if we have 8-10 categories instead of 3 for pclass is there an option to select and show the top 5 pclass from the 8-10 categories and plot them.using ggplot...because I work in the aerospace industry and have multiple categories for each variable...thanks
    Amod Shirke
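
    One possible way to plot only the five most frequent categories, sketched on made-up data. The `parts` data frame and its `category` column are hypothetical names, and the filtering is done in base R before the data reaches ggplot2:

```r
library(ggplot2)

# Made-up data with 10 categories of unequal frequency.
set.seed(42)
parts <- data.frame(
  category = sample(paste0("cat_", 1:10), 500, replace = TRUE, prob = 10:1)
)

# Count each category and keep the names of the five most frequent.
counts <- sort(table(parts$category), decreasing = TRUE)
top5   <- names(counts)[1:5]

# Keep only rows in those categories, ordered from most to least frequent.
top_parts <- parts[parts$category %in% top5, , drop = FALSE]
top_parts$category <- factor(top_parts$category, levels = top5)

p <- ggplot(top_parts, aes(x = category)) +
  theme_bw() +
  geom_bar() +
  labs(x = "Category", y = "Count", title = "Top 5 Categories (made-up data)")

print(p)
```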

  • @yaoliao3517
    @yaoliao3517 4 years ago

    Really thanks

  • @sureshkharel1514
    @sureshkharel1514 4 years ago

    please create more content on prediction and fitting function

  • @atomii9455
    @atomii9455 4 years ago

    Wow. Thanks

  • @michaelrockinger
    @michaelrockinger 6 years ago +5

    The show really starts after 25 min. You should have discussed passengerid and name when you discussed the variables. Is it smart for ggplot to rely on factors for visualization? In a few days I will be desperate to remember that I need to convert to factors to get certain visualizations. It should be the programmer who has control, not the program. No? Is it really so complicated to put % in the plot? Not good publicity for such a great package as ggplot.
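
    On the percentage question, one option (not necessarily what the video does) is to stack the bars to 100% with `position = "fill"` and format the axis with `scales::percent`. A sketch on made-up `Pclass` and `Survived` columns:

```r
library(ggplot2)

# Made-up stand-in for the tutorial's Titanic data (not the real file).
set.seed(11)
titanic <- data.frame(
  Pclass   = factor(sample(1:3, 300, replace = TRUE)),
  Survived = factor(sample(0:1, 300, replace = TRUE))
)

# position = "fill" rescales each bar to a total of 1;
# scales::percent prints the axis ticks as percentages.
p <- ggplot(titanic, aes(x = Pclass, fill = Survived)) +
  theme_bw() +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Share of Passengers", title = "Survival Share by Class (made-up data)")

print(p)
```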

  • @ButterfaceGMusicSlump
    @ButterfaceGMusicSlump 5 years ago +1

    for people that are familiar with R skip to 33:33

  • @sayedmohammad2394
    @sayedmohammad2394 5 years ago

    start from @24:00

  • @ramenchetia7529
    @ramenchetia7529 5 years ago

    Hi, I tried running the code for 2nd question w.r.t Sex but still getting the grey bars. The color for survived is not coming. Please help.

  • @tpeldonyou
    @tpeldonyou 6 years ago

    What is the use of factorise here? I thought factorising some variable was going to be used later in exercise.

  • @manonabrilestades
    @manonabrilestades 5 years ago

    Can I ask you a question? It's about a graphic that I can't resolve.

  • @RanaMuhammadWaqas
    @RanaMuhammadWaqas 4 years ago

    ggplot(titanic, aes(x = Age)) +
    theme_bw() +
    geom_histogram(binwidth = 5) +
    labs(y = "Passenger Count",
    x = "Age (binwidth = 5)",
    title = "Titanic Age Distribution")
    This doesn't work; I'm getting an error:
    Error: StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count"?
    >
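
    The error message says `Age` is being treated as discrete. One possible cause, an assumption rather than something confirmed in this thread, is that `Age` was read as text or converted to a factor along with the other columns. A sketch of converting it back to numeric on made-up data:

```r
library(ggplot2)

# Made-up data where Age has ended up as a factor.
titanic <- data.frame(
  Age = factor(c("22", "38", "26", "35", "54", "2", "27", "14"))
)

# Convert via character first: as.numeric() on a factor alone
# would return the internal level codes, not the ages.
titanic$Age <- as.numeric(as.character(titanic$Age))

# With a numeric x variable, geom_histogram can bin the values.
p <- ggplot(titanic, aes(x = Age)) +
  theme_bw() +
  geom_histogram(binwidth = 5) +
  labs(y = "Passenger Count",
       x = "Age (binwidth = 5)",
       title = "Titanic Age Distribution")

print(p)
```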

  • @the29couple7
    @the29couple7 6 years ago

    Hi, I started working with ggplot2 recently.
    install.packages("ggplot2") works fine, but when running library(ggplot2) I get the error below and can't get past it.
    library(ggplot2)
    Error: package or namespace load failed for ‘ggplot2’:
    object ‘enexprs’ is not exported by 'namespace:rlang'
    In addition: Warning message:
    package ‘ggplot2’ was built under R version 3.4.4
    Your help is highly needed

  • @LucasSantos-sh8tu
    @LucasSantos-sh8tu 5 years ago

    Hi! How can I change the graphic's colors? I created the graphic, but I don't want it to have only ggplot2's basic colors, like the pink and blue. I did a little research on Google, but I could only find how to change the colors on graphics that have continuous variables, and my variables are discrete. Can you help me?
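
    For a discrete variable mapped to `fill`, ggplot2's `scale_fill_manual()` accepts one color per level. A sketch on made-up data; the two colors are arbitrary choices:

```r
library(ggplot2)

# Made-up stand-in for the tutorial's Titanic data (not the real file).
titanic <- data.frame(
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Survived = factor(c(0, 1, 1, 0, 1, 0))
)

# Named vector: factor level -> color.
survival_colors <- c("0" = "grey40", "1" = "darkorange")

p <- ggplot(titanic, aes(x = Sex, fill = Survived)) +
  theme_bw() +
  geom_bar() +
  scale_fill_manual(values = survival_colors) +
  labs(y = "Passenger Count", title = "Custom Fill Colors (made-up data)")

print(p)
```

    For colors mapped with `aes(color = ...)` rather than `fill`, the matching function is `scale_color_manual()`.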

  • @dmtv_5559
    @dmtv_5559 6 years ago

    Please how do I display equation of the line and r^2 on my plots in R? In excel it is very easy to do this. I am buying into R because of R markdown. Please help out as I need my equation displayed just the way I use to in excel

    • @rexevan6714
      @rexevan6714 6 years ago

      Liberty Mgbanyi, you can use the annotate function with paste0 to show the r^2 and equation. If you have an r^2 and equation for every facet, you would need to add more variables / columns.
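
      A sketch of that approach on made-up data: fit a line with `lm()`, build the label with `paste0()`, and place it with `annotate()`. The variable names and the label position are illustrative choices:

```r
library(ggplot2)

# Made-up data with a roughly linear relationship.
set.seed(7)
df <- data.frame(x = 1:30)
df$y <- 2.5 * df$x + 4 + rnorm(30, sd = 3)

# Fit the line and pull out slope, intercept, and R^2.
fit       <- lm(y ~ x, data = df)
slope     <- coef(fit)[["x"]]
intercept <- coef(fit)[["(Intercept)"]]
r2        <- summary(fit)$r.squared

# Build a two-line text label for the plot.
eq_label <- paste0(
  "y = ", round(slope, 2), "x ",
  if (intercept < 0) "- " else "+ ", abs(round(intercept, 2)),
  "\nR^2 = ", round(r2, 3)
)

# Draw points, the fitted line, and the label in the top-left corner.
p <- ggplot(df, aes(x = x, y = y)) +
  theme_bw() +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  annotate("text", x = min(df$x), y = max(df$y),
           label = eq_label, hjust = 0, vjust = 1)

print(p)
```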

  • @saurabhyelmame
    @saurabhyelmame 3 years ago +1

    Why are the columns converted to factors at 32:38?
    I tried using them without converting to factors.
    Some were executed but some gave errors.

    • @HarshPatel-ph1lp
      @HarshPatel-ph1lp 3 years ago

      I think if you don't convert them into factors (which are basically categorical variables), then R will think them just as a string of words rather than recurring categories.
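
      A small base R illustration of what the conversion changes, assuming (as in the commonly used Titanic CSV) that `Pclass` and `Survived` are stored as number codes. The rows below are made up:

```r
# Made-up rows with class and survival stored as integer codes.
titanic <- data.frame(
  Pclass   = c(3L, 1L, 3L, 2L, 1L),
  Survived = c(0L, 1L, 1L, 0L, 1L)
)

class(titanic$Pclass)   # "integer": R treats the codes as numbers

# Convert the codes to categorical variables.
titanic$Pclass   <- as.factor(titanic$Pclass)
titanic$Survived <- as.factor(titanic$Survived)

class(titanic$Pclass)   # "factor"
levels(titanic$Pclass)  # "1" "2" "3": a fixed set of categories
table(titanic$Pclass)   # counts per category
```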

  • @schraudi5763
    @schraudi5763 3 years ago

    Isn't there a mistake in the axis description of the last histogram?

  • @didimoescobar2247
    @didimoescobar2247 5 years ago

    I can't read the CSV file... please help.

  • @georgemathematician1186
    @georgemathematician1186 2 years ago

    Pretty cool

  • @josephmakambala6395
    @josephmakambala6395 2 years ago +1

    I need some help with my R language biostatistics and I'm glad to pay the affordable R tutorial through Zoom or any other platform.

    • @Datasciencedojo
      @Datasciencedojo 2 years ago

      Hello Joseph, do check out our free course on R: online.datasciencedojo.com/course/R-Programming

  • @vincenzo4259
    @vincenzo4259 2 years ago

    Thanks

    • @Datasciencedojo
      @Datasciencedojo 2 years ago

      Keep following us for more tutorials.

  • @mehriban9003
    @mehriban9003 4 years ago

    I wanted to practice with the file while watching this video. But I haven't been able to download the file that he mentions as 'easily downloadable'. I wish he made the file available for downloading on github.

  • @punchline9131
    @punchline9131 4 years ago

    I would like to use "ggplot2" to create a graph showing whether life satisfaction returns to the value it had before the unemployment event occurred.
    Unemployment I have coded with 0 = not unemployed and 1 = unemployed. General life satisfaction is coded 0 - 10.
    I have already created the data set df_emp with a subset command, which contains all persons who were unemployed at least once. It also contains all years of observation of these persons. So all years before, during and after unemployment, as well as the corresponding values for life satisfaction
    The years before, during and after the event should now be entered on the x-axis. Where 0 is the event unemployment. The values -1 -2 and 1 2 etc. show values for the years before and after unemployment.
    On the y-axis the values for life satisfaction should then be deducted (centered).
    Unfortunately, I can't manage to model this graphic in R and would therefore be very pleased if somebody could help me or give me some tips on how to proceed.
    Best regards
    ps. I got the graphic from Lucas et al. (2004) - Unemployment alters the set point for life satisfaction
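
    One possible starting point for such a plot, sketched on simulated data. All column names (`id`, `year`, `unemployed`, `life_sat`) are hypothetical, each person is assumed to have at least one unemployment year, and life satisfaction is centered on each person's own mean. This is only one reading of "centered", not the method of Lucas et al. (2004):

```r
library(ggplot2)

# Simulated panel: one row per person and year (not real survey data).
set.seed(3)
df_emp <- expand.grid(id = 1:40, year = 2000:2008)
df_emp$unemployed <- as.integer(df_emp$year == 2003 + df_emp$id %% 3)
df_emp$life_sat <- pmin(10, pmax(0,
  round(7 - 2 * df_emp$unemployed + rnorm(nrow(df_emp)))))

# Year of each person's first unemployment spell.
is_unemp   <- df_emp$unemployed == 1
event_year <- tapply(df_emp$year[is_unemp], df_emp$id[is_unemp], min)

# Years relative to the event: 0 = year of unemployment.
df_emp$rel_year <- df_emp$year -
  as.numeric(event_year[as.character(df_emp$id)])

# Center satisfaction on each person's own mean.
df_emp$sat_centered <- df_emp$life_sat - ave(df_emp$life_sat, df_emp$id)

# Average centered satisfaction for each relative year.
avg <- aggregate(sat_centered ~ rel_year, data = df_emp, FUN = mean)

p <- ggplot(avg, aes(x = rel_year, y = sat_centered)) +
  theme_bw() +
  geom_line() +
  geom_point() +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Years Relative to Unemployment",
       y = "Life Satisfaction (centered)",
       title = "Life Satisfaction Around Unemployment (simulated)")

print(p)
```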

  • @kmlim4657
    @kmlim4657 5 years ago +1

    Thanks for great lecture, and I'm wondering in the lecture around 1:05:58, the code indicates that aes(x is Age), but the label says it is Survived. Shouldn't it be flipped? labs(y = Survived, x = Age ...)

  • @hasibulislam2187
    @hasibulislam2187 6 years ago

    At 42:30, you have colors on your bars, but with the same code my bars all have the same color. Why? Please give me a solution. Thanks in advance.

    • @RobertFisher
      @RobertFisher 6 years ago

      You likely forgot to set up the factors. Go back to around 32:00 in the video for the explanation.
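
      A sketch of the fix on made-up data: when `Survived` is left as a number, ggplot2 treats the fill as continuous and the bars typically come out in a single grey; after conversion to a factor, each level gets its own color:

```r
library(ggplot2)

# Made-up rows with Survived stored as integer codes, as after a CSV import.
titanic <- data.frame(
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Survived = c(0L, 1L, 1L, 0L, 1L, 0L)
)

# Convert the code to a factor so fill is mapped to discrete colors.
titanic$Survived <- as.factor(titanic$Survived)

p <- ggplot(titanic, aes(x = Sex, fill = Survived)) +
  theme_bw() +
  geom_bar() +
  labs(y = "Passenger Count", title = "Survival by Sex (made-up data)")

print(p)
```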

    • @hasibulislam2187
      @hasibulislam2187 6 years ago

      Yes, I did that after posting my comment. Thanks for replying.

  • @sharanyasredharan9625
    @sharanyasredharan9625 3 years ago

    Hello. The titanic.csv dataset is stored on code.datasciencedojo. May I please know where or how else I can retrieve the file, as I would like to go through this class? Thank you.

  • @pravinmsp8025
    @pravinmsp8025 5 years ago

    I need the excel or csv file to practice

    • @Datasciencedojo
      @Datasciencedojo 5 years ago

      You can find our supplemental material here: code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Data%20Visualization%20with%20R%20and%20ggplot2

  • @robinredhu1995
    @robinredhu1995 6 years ago +4

    I think in last two graphs both density and histogram are wrongly labeled

    • @headhunterz1000
      @headhunterz1000 6 years ago

      Would you elaborate on why this is?

    • @juanpablorb7150
      @juanpablorb7150 5 years ago

      I was thinking the same. It doesn't make sense to label the y axis as survived, but rather frequency? while the x axis should just be age.

    • @FRANCESCO-wj8rs
      @FRANCESCO-wj8rs 5 years ago

      I also believe it is wrong. It should be age on the x-axis.

    • @aakash4488
      @aakash4488 5 years ago

      The age is on the x-axis and survived is on the y-axis.

  • @bikashpokharel478
    @bikashpokharel478 3 years ago +1

    Wait a minute, I am gonna update my linked In bio to R expert

  • @KK-ty6ez
    @KK-ty6ez 3 years ago

    DATA SCIENCE DOJO, please share the R code; mine is not running properly.

  • @ahsanamrirohman6865
    @ahsanamrirohman6865 4 years ago

    Install equiser package

  • @djangoworldwide7925
    @djangoworldwide7925 3 years ago

    That video was fantastic. I now know i should be a girl if i want to go on a cruise

  • @kakekmesum4757
    @kakekmesum4757 4 years ago

    tq dojo

  • @jovial129
    @jovial129 3 years ago

    Histogram 54:44

  • @TheWankersunited
    @TheWankersunited 3 years ago

    what a load of unnecessary talking.... 25 minutes of not needed introduction. 5 minutes to tell why passengerID and name are not relevant.... get to the point already...