The Data Digest
EURO 2024: Analyzing the group phase matches by market value
In this video, I analyze the group phase matches of Euro 2024 by market value.
Did the more valuable team always beat the team of lower market value?
Which team surprisingly continued to the knock-out phase?
Link to flourish interactive chart:
flourish-user-preview.com/17991450/TYJ5Ze021DQxj-2EhIiqbN1PWJDICBHYhIOHgQahahPRlmCBmHb24TqzUabav0fW/
Views: 927

Videos

I finished the "Productive R Workflow" course. Here is my review!
210 views, 1 month ago
In this video I want to share with you my experience with the Productive R Workflow course that was created by Yan Holtz. www.productive-r-workflow.com/ DISCLAIMER: This video is not paid for or sponsored by the creator of the course. It is my honest and unbiased opinion. I am also not affiliated with him or his work, but I know the creator from his other websites and follow him on Twitter: r-graph-g...
ALL Titled Tuesday Chess Tournament WINNERS (2014-2024)
205 views, 3 months ago
There have been almost 400 Titled Tuesday Blitz Chess Tournaments. And in this animation I show all the winners. The first tournament was won on December 1st 2014 by Daniel Naroditsky. Data source: www.chess.com/tournament/live/titled-tuesdays Animation: flourish.studio/
US Edition of Titled Tuesday 2023 Analysis
126 views, 4 months ago
Watch the US Edition of my 2023 Titled Tuesday Blitz Chess Tournament Analysis. I focus on the most common match-ups of the top US players and their results and investigate correlations between their ratings, scores and games played on average. ⏱ Time Stamps ⌚ 0:00 - Intro 0:19 - Hikaru's results vs most common opponents 1:12 - Table of content 1:55 - US participation in Titled Tuesday Tourname...
Analyzing all Titled Tuesday Chess Tournaments from 2023
3.6K views, 4 months ago
I often watch chess tournaments on YouTube, so I was curious to analyze a whole year of blitz games played in the Titled Tuesday tournaments, sponsored by @chess.com, where some of the world's best chess players participate. ⏱ Time Stamps ⌚ 0:00 - Intro 1:45 - Data import and setup steps 2:38 - Data structure and variables 4:31 - What was the missing tournament 6:06 - Number of players participat...
ALL 53 ggplot2 GEOMS shown in R
2.2K views, 10 months ago
Did you know that there are 50 different "geoms" included in the ggplot2 package? How many can you name before watching? I hope you enjoy watching all of them in this one video. ⏱ Time Stamps ⌚ 0:00 - intro 1:00 - point, jitter 2:10 - boxplot, violin, dotplot 3:55 - bar, col, 5:30 - histogram, freqpoly, density 7:30 - line, step, path, area 10:53 - hline, vline, segment, rect 13:27 - abline, sm...
How to Create Pie Charts in R (6 easy ways)
6K views, 11 months ago
In this video, I will show you how to create pie charts in R with different functions that often come from special packages. ⏱ Time Stamps ⌚ 0:00 - Intro 1:01 - graphics::pie 4:57 - ggplot2::geom_col + coord_polar 9:29 - ggpattern 11:55 - ggpubr::ggpie 13:38 - plot_ly(type = "pie") 15:17 - plotrix::pie3D 16:01 - webr::PieDonut External Links: coolbutuseless.github.io/package/ggpattern/articles/geo...
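As a rough illustration of the geom_col plus coord_polar approach listed in the time stamps, here is a minimal sketch. The data frame `fruit` and its values are made up for this example and are not taken from the video.

```r
library(ggplot2)

# Hypothetical example data (an assumption, not from the video)
fruit <- data.frame(
  category = c("Apples", "Bananas", "Cherries"),
  count    = c(50, 30, 20)
)

# One stacked bar at a single x position, bent into a circle by coord_polar()
pie_plot <- ggplot(fruit, aes(x = "", y = count, fill = category)) +
  geom_col(width = 1, color = "white") +
  coord_polar(theta = "y") +
  theme_void()

# Base R alternative with the graphics package:
# pie(fruit$count, labels = fruit$category)
```

Printing `pie_plot` draws the chart. Setting `theta = "y"` maps the bar heights to angles, which is what turns the stacked bar into pie slices.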
How to create Multi-Panel plots in R with facet_wrap() and facet_grid()
3.9K views, 11 months ago
In this video, I will show you how to use facet_wrap or facet_grid to create multi-panel plots in R and ggplot. Faceting plots can be very helpful to compare your data over different categories. It can also help you to get a quick overview of your data across multiple variables if you use a special pivot longer trick. ⏱ Time Stamps ⌚ 0:00 - Intro 1:20 - gapminder example 5:09 - facet_wrap argum...
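The pivot longer idea mentioned above might look roughly like the sketch below. It is an assumption about the trick, shown on the built-in iris data rather than the gapminder data used in the video: all numeric columns are stacked into name/value pairs so that each variable gets its own panel.

```r
library(ggplot2)
library(tidyr)

# Stack the four numeric columns of iris into "variable" / "value" pairs
iris_long <- pivot_longer(
  iris,
  cols      = -Species,
  names_to  = "variable",
  values_to = "value"
)

# facet_wrap(): one panel per variable, each with its own axis ranges
overview_plot <- ggplot(iris_long, aes(x = value, fill = Species)) +
  geom_histogram(bins = 20, alpha = 0.7, position = "identity") +
  facet_wrap(~ variable, scales = "free")

# facet_grid(): panels arranged as rows ~ columns
grid_plot <- ggplot(iris_long, aes(x = value)) +
  geom_histogram(bins = 20) +
  facet_grid(Species ~ variable, scales = "free_x")
```

`facet_wrap()` suits a single grouping variable, while `facet_grid()` lays out two variables as a matrix of panels.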
4 Examples of Using R Functions for Exploratory Data Analysis (EDA)
4.7K views, 1 year ago
In this video, I will show you four examples of Exploratory data analysis (EDA) using R. EDA is a critical data analysis technique that can help you identify important insights in your data. I will explain the most important functions, which will equip you to identify important patterns in your data and make better decisions based on them. ⏱ Time Stamps ⌚ 0:00 - Intro 1:28 - mtcars 10:47 - gapmind...
Three Important Lessons You can Learn from Anscombe's Datasets
620 views, 1 year ago
In this video I demonstrate three important lessons that Anscombe's Quartet can teach you in R. I will show you different ways to calculate summary statistics, from basic indexing and function calls to apply and for loops and useful tidyverse functions like pivot_longer, group_by and summarize. Visualizing datasets is crucial to spot outliers or detect the real pattern and relationship of vari...
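A minimal base R sketch of the summary-statistics lesson follows. It is not the code from the video, and the helper `plot_quartet()` is a hypothetical name introduced here. It uses the built-in `anscombe` data frame with columns x1..x4 and y1..y4.

```r
# anscombe is a built-in dataset with columns x1..x4 and y1..y4
x_cols <- paste0("x", 1:4)
y_cols <- paste0("y", 1:4)

# Summary statistics per dataset: nearly identical across all four
means_x <- sapply(anscombe[x_cols], mean)
means_y <- sapply(anscombe[y_cols], mean)
cors <- mapply(function(x, y) cor(anscombe[[x]], anscombe[[y]]), x_cols, y_cols)

print(round(rbind(mean_x = means_x, mean_y = means_y, cor = cors), 2))

# Plotting reveals how different the four datasets really are
plot_quartet <- function() {
  old_par <- par(mfrow = c(2, 2))
  on.exit(par(old_par))
  for (i in 1:4) {
    plot(anscombe[[x_cols[i]]], anscombe[[y_cols[i]]],
         xlab = x_cols[i], ylab = y_cols[i], pch = 19)
  }
}
```

Calling `plot_quartet()` draws the four scatterplots in a 2x2 grid. The matching means and correlations next to visibly different plots are the point of the quartet.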
Ultimate Guide to Colors in R
2.4K views, 1 year ago
In this video I want to show you how you can change colors in R. I start with the basic functions and then give some advanced examples. ⏱ Time Stamps ⌚ 0:00 - Intro 0:45 - base::plot 7:37 - mapping colors with ggplot() 13:45 - color based on conditional 15:20 - scale_color/fill() functions 23:45 - viridis and RColorBrewer() 27:43 - show_col() 28:28 - rgb() 30:15 - col2rgb() 31:16 - colorRampPal...
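Some of the base R color helpers named in the time stamps can be tried with the short sketch below. The variable names are chosen for this example only.

```r
# rgb() builds a hex color code from red, green and blue components
orange_hex <- rgb(255, 165, 0, maxColorValue = 255)

# col2rgb() goes the other way: color name or hex code to RGB values
orange_rgb <- col2rgb("orange")

# colorRampPalette() returns a function that interpolates between colors
blue_to_red <- colorRampPalette(c("blue", "red"))
five_colors <- blue_to_red(5)

print(orange_hex)
print(orange_rgb)
print(five_colors)
```

The vector `five_colors` can be passed to `col =` in base plots or to `scale_color_manual(values = ...)` in ggplot2.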
R Beginner? Start Here and Learn over 100 basic R functions!
3.6K views, 1 year ago
In this video I want to show you over 100 basic but extremely useful R functions that every R user should know. Beginners should at least have heard of them, but experienced R users might also find some extra tips in this video. ⏱ Time Stamps ⌚ 0:00 - Intro 1:09 - combine c() and str() 2:56 - help() and assign operator 4:41 - indexing with [] and $ 8:04 - colon operator and ...
Powerful R Functions Every Data Analyst Should Know
2.2K views, 1 year ago
In this video I show you how you can quickly calculate summary statistics in R for one or more categorical variables of the diamonds dataset. The summarize() function can be used to calculate many different results for one continuous variable. group_by() allows you to replicate these results for different levels of a categorical variable. And with pivot_wider() you can calculate a result of int...
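The combination of group_by(), summarize() and pivot_wider() described above could look roughly like this. It is a sketch under the assumption that price is the continuous variable of interest; the exact statistics computed in the video are not known from the description.

```r
library(dplyr)
library(tidyr)
library(ggplot2)  # only needed here for the diamonds dataset

# Several summary statistics of one continuous variable per level of cut
price_by_cut <- diamonds %>%
  group_by(cut) %>%
  summarize(
    n            = n(),
    mean_price   = mean(price),
    median_price = median(price)
  )

# Two categorical variables: one row per cut, one column per color
price_wide <- diamonds %>%
  group_by(cut, color) %>%
  summarize(mean_price = mean(price), .groups = "drop") %>%
  pivot_wider(names_from = color, values_from = mean_price)
```

`pivot_wider()` turns the long cut/color table into a compact cross table that is easier to read side by side.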
How to Create Parallel Plots in R with geom_line() and ggparcoord()
3.5K views, 1 year ago
In this tutorial I will show you how to draw multiple lines in R with the geom_line() function and the ggparcoord() function. You can use these to highlight a group of lines for certain categories or highlight an individual response. Sometimes these chart types are also called Spaghetti plots. I will also go through the function arguments from the ggparcoord function that is provided by the GGa...
How to use the 4 different norm() functions in R
13K views, 1 year ago
Analyzing Monkeypox Cases in R for Beginners
1.4K views, 2 years ago
How to add multiple pictures to ggplot in R with image_read and patchwork
1.6K views, 2 years ago
How to Create Bubble Charts in R with geom_point() and scale_size()
6K views, 2 years ago
How to Create Correlation Plots in R
41K views, 2 years ago
How to Create Heatmaps in R with the geom_tile() and heatmap() functions.
3.9K views, 2 years ago
Animating the Datasaurus Dozen Dataset in R
1.1K views, 2 years ago
Line charts and Connected Scatterplots in R with geom_line() and geom_path()
2.9K views, 2 years ago
Scatterplots in R with geom_point() and geom_text/label()
12K views, 2 years ago
Contour plots in R with geom_density_2d/filled() and geom_bin2d() [R- Graph Gallery Tutorial]
4.6K views, 2 years ago
Ridgeline plots in R with geom_ridgeline() and geom_density_ridges() [R- Graph Gallery Tutorial]
5K views, 2 years ago
From which Leagues do the Players of the EURO 2020 come from? Chord Diagram Visualization in R
830 views, 2 years ago
Boxplots in R with ggplot and geom_boxplot() [R- Graph Gallery Tutorial]
22K views, 3 years ago
How to reorder factors in R (the easy way)
7K views, 3 years ago
Histograms in R with ggplot and geom_histogram() [R-Graph Gallery Tutorial]
18K views, 3 years ago
Density Plot in R with ggplot and geom_density() [R-Graph Gallery Tutorial]
13K views, 3 years ago

Comments

  • @arroe8386
    @arroe8386 5 days ago

    Would be interesting to do an analysis adjusted for age. It's accessible to look up the current values, but since market values are about what a player could potentially offer you in the future, I wonder how it would change if we put in a factor based on age to model the squad quality at the moment.

    • @TheDataDigest
      @TheDataDigest 5 days ago

      Absolutely. For example, the German goalkeepers are quite similar in skill, but ter Stegen (age 32) is worth €28m vs. Neuer (age 38) at €4m. I am planning to do a bigger analysis with all 622 players where I also show the R code. There I will consider age, games played and other factors. Also, Mbappe, worth €180m, moved from Paris to Madrid this year without that payment because his contract ended.

    • @arroe8386
      @arroe8386 5 days ago

      @TheDataDigest cool 👍🏻

  • @GengarEdit
    @GengarEdit 6 days ago

    Good video, thank you 😊

    • @TheDataDigest
      @TheDataDigest 6 days ago

      Good comment :) Many thanks in return. I was simply curious how meaningful market value is for the results.

  • @muhammadaqil2869
    @muhammadaqil2869 7 days ago

    best instructor

    • @TheDataDigest
      @TheDataDigest 1 day ago

      Thanks for watching my video and the nice compliment. I think I can still improve but I am glad you found the instructions helpful.

  • @statistic8362
    @statistic8362 9 days ago

    Excellent! ❤🙌🏻

  • @Crass1000
    @Crass1000 18 days ago

    Great example. Is it possible to include a median (or another) value in each plot instead of quantile?

    • @TheDataDigest
      @TheDataDigest 18 days ago

      Can you give me a timestamp to the plot you are referring to? You can always add a line with "+ geom_vline(xintercept = median(data$column))". Or geom_hline(yintercept = ...) for horizontal lines.
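The geom_vline() suggestion in this reply can be tried with the sketch below. The mtcars data and the faceting by `cyl` are assumptions made for illustration, since the plot the question refers to is not identified.

```r
library(ggplot2)

# Single plot: one dashed line at the overall median
med_mpg <- median(mtcars$mpg)

p_single <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  geom_vline(xintercept = med_mpg, linetype = "dashed", color = "red")

# Faceted plot: compute one median per panel first, then map it in aes()
medians <- aggregate(mpg ~ cyl, data = mtcars, FUN = median)

p_facets <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  geom_vline(data = medians, aes(xintercept = mpg),
             linetype = "dashed", color = "red") +
  facet_wrap(~ cyl)
```

Passing a separate summary data frame to `geom_vline()` is what gives each panel its own line. A constant `xintercept` would draw the same line in every panel.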

  • @XXTargaryenXX
    @XXTargaryenXX 24 days ago

    Fantastic! It helped a lot

  • @91lovemusic
    @91lovemusic 26 days ago

    14:19 i still fail doing that chart 😅

    • @TheDataDigest
      @TheDataDigest 1 day ago

      What error message are you getting? Did you load the tidytext package to have the reorder_within() function available?

  • @Dataciiiiiiiii
    @Dataciiiiiiiii 1 month ago

    If I do clustering with k-means and then assign labels, can I use this to demonstrate how features affect the clustering? I'll think about using this in my report. Thanks, sir, I'm new to all of this.

  • @kristianfagerstrom7011
    @kristianfagerstrom7011 1 month ago

    I'm looking for % of TTuesday games decided on time. Is that data in there? I can't see it in the timelines titles.

    • @TheDataDigest
      @TheDataDigest 1 month ago

      This information was not included in the tournament csv files that I analyzed for this video. There are however files for each tournament that contain the game information. I am also interested in such questions like "flagging %" and decisive games depending on titles and rating differences. And the bongcloud opening or the cow :) But to analyze these huge text files I have to write some specific functions first (or let ChatGPT do it) :D

    • @kristianfagerstrom7011
      @kristianfagerstrom7011 1 month ago

      @TheDataDigest Thanks for the swift answer! I am not able to sift through that data myself, so I was hoping that someone had done the job for me ;-) I am curious because of the "clash of claims" match - a comparison between 3+1 and 3+2 flagging rate would be interesting. I suspect that Kramnik wants 3+2 because he's afraid of flagging, and that makes the "clash" a bit weird since it's not Titled Tuesday format. But without data I am disinclined to propose that theory.

    • @TheDataDigest
      @TheDataDigest 1 month ago

      @kristianfagerstrom7011 Titled Tuesday is 3+1, right? Would you have examples of 3+2 tournaments where I could download some games to check your hypothesis? I want to look into these game analyses very soon.

    • @kristianfagerstrom7011
      @kristianfagerstrom7011 1 month ago

      @TheDataDigest Yes, TT is 3+1, and no, sorry I don't have any tournament data, but maybe chess.com would be willing to share?

  • @manoelbastosfreirejunior2068
    @manoelbastosfreirejunior2068 1 month ago

    Hello Professor Alex, Could you please make the script of this class available? Thank you for the explanation: Manoel - Maceió/Alagoas/Brasil

  • @m.shihamadam7664
    @m.shihamadam7664 1 month ago

    Thank you for posting this video - well explained!

  • @terrychamulak3557
    @terrychamulak3557 1 month ago

    Without exception the BEST introduction to R video anywhere. Tutorial is very well done, thank you for sharing your knowledge and wisdom in R. This is a MUST for anyone interested and serious about learning R.

    • @TheDataDigest
      @TheDataDigest 1 month ago

      Wow, what a great compliment. Glad you liked it. It took me a long time to make and I am happy when people find it useful, share it and leave a comment.

  • @Geology_monster
    @Geology_monster 1 month ago

    bro i love u

  • @corylowe5557
    @corylowe5557 1 month ago

    Awesome!

  • @ClaisomWest
    @ClaisomWest 2 months ago

    Absolute life saver of a video. I'm desperately trying to do my stats project, which requires R, and I'm dogshite with programming stuff. I've actually managed to make something presentable.

    • @TheDataDigest
      @TheDataDigest 2 months ago

      Glad you found my video helpful then :) If you need more help with your stats project you can always reach out to me via: question@thedatadigest.email

  • @wildlifeireland9514
    @wildlifeireland9514 2 months ago

    Brilliant detail

    • @TheDataDigest
      @TheDataDigest 2 months ago

      Thanks. It is my best performing video so far. Maybe I should make one about ggpairs which is also a good first visualization package.

    • @wildlifeireland9514
      @wildlifeireland9514 2 months ago

      Absolutely, if it's not too much hassle....they're of great help

  • @MahmudulHasan-gp9qm
    @MahmudulHasan-gp9qm 2 months ago

    "The data must be given as dataframe." Please help me fix it.

    • @TheDataDigest
      @TheDataDigest 2 months ago

      Use the str() function on your data object, like "str(my_data)" and tell me what it says. It probably is not a data.frame. But you can turn it into a data.frame with the "as.data.frame(my_data)" or simply "data.frame(my_data)" functions.
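The check and conversion suggested in this reply can be sketched as follows. The object `my_data` is a hypothetical stand-in (here a matrix), since the commenter's actual data is unknown.

```r
# Hypothetical input: a numeric matrix instead of a data.frame
my_data <- matrix(1:6, ncol = 2, dimnames = list(NULL, c("x", "y")))

str(my_data)                    # shows an int matrix, not a data.frame
print(is.data.frame(my_data))   # FALSE

# Convert and check again
my_df <- as.data.frame(my_data)

str(my_df)                      # now a data.frame with columns x and y
print(is.data.frame(my_df))     # TRUE
```

`is.data.frame()` gives a quick TRUE/FALSE answer, while `str()` also reveals what the object actually is, which helps when the conversion needs more than a single call.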

  • @charleydublin7304
    @charleydublin7304 2 months ago

    Excellent video - thank you

    • @TheDataDigest
      @TheDataDigest 2 months ago

      Thanks for the compliment :)

  • @DomenicaJosefina
    @DomenicaJosefina 2 months ago

    Could you tell the names of all the libraries you use? :)

    • @TheDataDigest
      @TheDataDigest 2 months ago

      I think in this video I only used ggplot2 library. It is part of the tidyverse package, which also includes dplyr. I also often use forcats (to work with factors), lubridate (to work with dates), and stringr (to manipulate strings). I also saw this tweet today about using ggdist package for distribution plots (twitter.com/MatthewBJane/status/1780952366541385910). Hope that helps :)

  • @lizongzhang
    @lizongzhang 3 months ago

    learned a lot! magnetic voice!

  • @AnseloSilver
    @AnseloSilver 3 months ago

    I have been trying to install the "hrbrthemes" package, but it does not work and I cannot use theme_ipsum(). I searched online, but none of the suggested solutions work.

  • @AnseloSilver
    @AnseloSilver 3 months ago

    I like this tutorial, it has a good pace and explanation going along with the step. I have seen some tutorials going too fast.

    • @TheDataDigest
      @TheDataDigest 2 months ago

      Feedback like this is highly appreciated. Thanks for leaving a comment. I do cut out all unnecessary speech so my videos are actually quite fast. But I want to show a lot and one can always rewind and watch again.

  • @mxm8900
    @mxm8900 3 months ago

    thanks for the video👌

    • @TheDataDigest
      @TheDataDigest 3 months ago

      Should I make an update for 2024?

  • @j7andrew
    @j7andrew 3 months ago

    Spectacular video. Thank you

    • @TheDataDigest
      @TheDataDigest 3 months ago

      Glad you liked it and left a comment. Helps the channel with engagement and promotion.

  • @federinik33
    @federinik33 4 months ago

    Thank you for your work, using your data everyone is now able to check "interesting" stats for themselves. I believe that your tools will greatly help to perform analysis like the one of the former world chess champion @VBKramnik. Cheating is a serious issue and it is difficult to detect it, data from past tournaments should show anomalies and therefore it should be used to guarantee fair play.

  • @hamedrajhi
    @hamedrajhi 4 months ago

    Another banger content. Great work!👏🏼

    • @TheDataDigest
      @TheDataDigest 4 months ago

      What a cool first comment :) Thank you very much. Glad you liked it.

  • @miguel-espinoza
    @miguel-espinoza 4 months ago

    This is seriously impressive! Well done!

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Thank you so much for leaving a comment. I am glad you liked it. It was definitely fun to do the analysis.

  • @mrmountain781
    @mrmountain781 4 months ago

    Thank you for the guidance. Even step by step to understand better for the coding, it is very helpful!

  • @taborsmrcna
    @taborsmrcna 4 months ago

    Very cool analysis! I am looking forward to the second part where you go beyond the descriptive statistics!

  • @kilte90
    @kilte90 4 months ago

    amazing! statistics are fun. and its great to have someone doing the work i wish i was capable of! TY

    • @TheDataDigest
      @TheDataDigest 4 months ago

      I think you are capable! If you are curious about a topic, or would like to learn some form of data analysis, there is always a simple step to be taken to get a little bit better. Start with some easy textbooks that have loads of examples. Or watch David Robinson's Tidy Tuesday screencasts on YouTube.

    • @kilte90
      @kilte90 4 months ago

      @TheDataDigest Well, thank you :) But in all honesty I don't think I could. My intelligence is almost decent, but the work you put in, the research and hours, it's also beyond me.

  • @luciangv3252
    @luciangv3252 4 months ago

    I would love to see an analysis of the extremes in the data, like which FM players are playing like GMs and how many games they win against them. On a lucky day, how many points does an average player (FM, IM, GM) get against an extreme player? How many sigma are the extreme players above the average for the same title? How often does a lower-rated player win against a higher-rated one with a difference of 300 Elo points, and what does Elo theory say about that?

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Oh wow, these are really good suggestions. I like the rating comparisons by different brackets of difference. And whether it makes a difference who has the white pieces. I think this would be great for some models. I will look into it!
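For the question about what Elo theory says about a 300-point gap, the standard Elo expected-score formula gives a quick reference value. The sketch below is not part of the video's analysis, the function name `elo_expected` is made up here, and online blitz ratings may follow a different rating system, so treat the numbers as an approximation.

```r
# Expected score of player A against player B under the classical Elo model:
# E_A = 1 / (1 + 10^((R_B - R_A) / 400))
elo_expected <- function(rating_a, rating_b) {
  1 / (1 + 10^((rating_b - rating_a) / 400))
}

# Player rated 300 points lower than the opponent
underdog <- elo_expected(2500, 2800)
favorite <- elo_expected(2800, 2500)

print(round(underdog, 3))  # roughly 0.15 expected points per game
print(round(favorite, 3))  # roughly 0.85 expected points per game
```

The expected score counts a draw as half a point, so about 0.15 per game is an average over wins and draws, not a pure win probability. Comparing it with observed results per rating-difference bracket would show how closely the data follow the model.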

  • @TheDataDigest
    @TheDataDigest 4 months ago

    Raw data and RDS files and analysis scripts can be found and downloaded here: github.com/TheDataDigest/Chess/tree/main/input Have fun analyzing and exploring all 2023 Titled Tuesday Tournaments.

  • @rrohit1713
    @rrohit1713 4 months ago

    Can you please share the dataset with the community? I would like to work on it as well

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Yes, of course. @Kivallara also asked for it. I am currently moving it to my GitHub and will then pin the link to it in the top comment.

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Dataset can be found here: github.com/TheDataDigest/EDA/tree/main/Chess Have fun analyzing.

    • @rrohit1713
      @rrohit1713 4 months ago

      @TheDataDigest thanks a lot and keep doing the amazing things

  • @DhirajBarnwal
    @DhirajBarnwal 4 months ago

    Great work! Can you share the final consolidated dataset on GitHub or Kaggle so that others can do more analysis on top of this?

  • @u.v.s.5583
    @u.v.s.5583 4 months ago

    You should have counted how many players marked by GM Kramnik as cheaters participated in each tournament and how many events they won.

    • @TheDataDigest
      @TheDataDigest 4 months ago

      I missed that controversy but found some interesting articles discussing the matter. Thanks for bringing it up. Now I better understand another comment that mentioned Kramnik and Hikaru before. But the winning streak analysis did not show any big surprises. Six players with 12 games, Hikaru with 15 and Magnus with 17. Seems fine given that Magnus is the GOAT and Hikaru has an incredible Blitz rating and participates so many times in TT.

  • @Patrick462
    @Patrick462 4 months ago

    Amazing! great analysis, and even greater explanation of how each result was determined!

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Glad you liked it Patrick and thank you for leaving a comment.

  • @tristansnow
    @tristansnow 4 months ago

    Women are not limited to WGM, WIM, etc. These are additional titles only available to women. So any calculations would only express the share holding that subset of titles, and wouldn't be a proxy for gender. For example, there are female IMs and GMs that participate in TTs. I loved this video, and will definitely check out your channel!

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Hi Tristan, first of all, you are correct. Second of all, thanks for leaving such a nice comment. I am glad you enjoyed the video. I did some more analysis and only found two cases of players playing under both men's and women's titles: 1) Meri-Arabidze (from GEO) participated 26x as WGM and 8x as IM. 2) Jiner Zhu (from CHN) participated 10x as WGM and 5x as GM. The best-placed woman gets $100 in each tournament, so I assume that most women will play under the women's title to qualify for that. Unless chess.com has another way of knowing, or they don't care that much about the prize money.

  • @TNBM4Life
    @TNBM4Life 4 months ago

    Unless I missed it, did you not calculate what was the probability, or % win rate I guess you could say (out of the ones who have won TT in 2023) of the players winning the TT? I guess its something as simple as dividing the number of TT wins they have by the number of total times they entered? Thats what I was expecting to see when i clicked, knowing Magnus probably participated in less TT than Hikaru. So of course while number of TT wins is interesting, unless it is relative to number of tournaments played it doesnt tell me all that much? And how often someone gets top 3 etc would also be interesting.

    • @TheDataDigest
      @TheDataDigest 4 months ago

      @40:42 I briefly show the winning percentage, but I forgot to make a chart. Is it okay if I list the top 10 highest winning percentages, with (wins/participations) in parentheses? Below I will answer the same for the chance to place in the top 3.
      1) Liem Le, VNM: 30% (3/10)
      2) Hikaru Nakamura, USA: 24.3% (18/74)
      3) Magnus Carlsen, NOR: 23.1% (9/39)
      4) Maxime Vachier-Lagrave, FRA: 20% (5/25)
      5) Platon Galperin, UKR: 16.7% (1/6)
      6) Eduardo Iturrizaga, ESP: 14.3% (1/7)
      7) Shakhriyar Mamedyarov, AZE: 12.5% (1/8)
      8) Daniil Dubov, RUS: 12.2% (5/41)
      9) Alexander Grischuk, RUS: 11.1% (3/27)
      10) Nihal Sarin, IND: 11.1% (5/45)

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Cool to see that Caruana and Artemiev make it into the list of the highest % to place in the top 3. Thanks for the extra request/question.
      1) Susanto Megaranto, IDN: 66.7% (2/3)
      2) Magnus Carlsen, NOR: 46.2% (18/39)
      3) Hikaru Nakamura, USA: 43.2% (32/74)
      4) Eduardo Iturrizaga, ESP: 42.9% (3/7)
      5) Liem Le, VNM: 40% (4/10)
      6) Khumoyun Begmuratov, UZB: 33.3% (1/3)
      7) Maxime Vachier-Lagrave, FRA: 32% (8/25)
      8) Fabiano Caruana, USA: 26.8% (11/41)
      9) Vladislav Artemiev, RUS: 25% (1/4)
      10) Aram Hakobyan, ARM: 24% (12/50)
      I should probably also do a top 5, which stands for % winning prize money.
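The percentages in these lists are simply wins (or top-3 finishes) divided by participations. A minimal sketch of that calculation follows, using four of the (wins/participations) pairs quoted in the winning-percentage list; the data frame layout is an assumption and not the structure of the original dataset.

```r
# (wins / participations) pairs as quoted in the comment thread
results <- data.frame(
  player         = c("Liem Le", "Hikaru Nakamura", "Magnus Carlsen", "Nihal Sarin"),
  wins           = c(3, 18, 9, 5),
  participations = c(10, 74, 39, 45)
)

# Winning percentage, rounded to one decimal place
results$win_pct <- round(100 * results$wins / results$participations, 1)

# Sort from highest to lowest winning percentage
results <- results[order(-results$win_pct), ]
print(results)
```

A minimum number of participations is worth adding as a filter in practice, because a player with 2 top-3 finishes in 3 events otherwise ranks above players with dozens of tournaments.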

  • @Tom-hz1kz
    @Tom-hz1kz 4 months ago

    You said you calculated the "average score needed to win" but what you actually calculated could more accurately be described as "average score of the winner". There are a number of tournaments where winners did not actually need to get the score they got in order to win the tournament, as discussed later in the video: Hikaru and Magnus did not need 11 points to win the tournaments, 9 points and the right tiebreak would have been enough. If you would really try to calculate the "average score needed to win" then it should be calculated based on how many points the 2nd player got because all that was needed to win was the same number of points plus a better tiebreak.

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Great catch. Very precise use of language, but you are correct. So in order to show the average of what I meant, I would have to look at the score of number 2 and then add half a point. Or, when there is a tie, live with the fact that the Sonneborn-Berger score will resolve it.
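The difference between the two averages discussed here can be made concrete with a small sketch. The standings below are invented for illustration (three hypothetical tournaments A, B and C), and adding half a point to the runner-up's score is the simplification proposed in this reply, ignoring tiebreaks.

```r
# Hypothetical standings: top three scores of three made-up tournaments
standings <- data.frame(
  tournament = rep(c("A", "B", "C"), each = 3),
  rank       = rep(1:3, times = 3),
  score      = c(11, 9, 9,
                 10, 9.5, 9,
                 9.5, 9.5, 9)
)

winners    <- standings[standings$rank == 1, ]
runners_up <- standings[standings$rank == 2, ]

# What was originally reported: the average score of the winner
avg_winner_score <- mean(winners$score)

# Closer to "score needed to win": runner-up score plus half a point,
# which guarantees clear first place without relying on a tiebreak
avg_score_needed <- mean(runners_up$score + 0.5)

print(c(avg_winner_score = avg_winner_score, avg_score_needed = avg_score_needed))
```

In tournament C the winner and runner-up are tied, so the "needed" score for clear first is higher than what the winner actually scored. That is the case where the tiebreak decides instead.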

  • @michal_kowal
    @michal_kowal 4 months ago

    Just waiting for Kramnik or Nakamura to use this video for their Interesting analysis...😅

    • @TheDataDigest
      @TheDataDigest 4 months ago

      I would love that. I am actually planning to analyze all games to find out what the most common opening was and the rarest one and how often a player wins on time etc. Edit: I first did not understand the reference and the matching emoji. But another comment made me aware of certain accusations 😅

    • @u.v.s.5583
      @u.v.s.5583 4 months ago

      Cheating Tuesdays FTW!

  • @B2theENJAMIN
    @B2theENJAMIN 4 months ago

    great video. fyi there are women with "male" titles i.e. the top women achieve a rating of 2400 plus norms

    • @TheDataDigest
      @TheDataDigest 4 months ago

      You are of course right. Judit Polgar comes to mind. I checked the data again and I could only find two women who are listed with both women's and men's titles. Maybe there are some who use the men's title only, but I wouldn't be sure from the name alone. I found a list of 41 women with the GM title and did some spot checks of the most recent ones, but could not find their names in the data set. The two I found are: Meri-Arabidze (GEO), with WGM and IM and a best rating of 2745, and Jiner Zhu (CHN), with WGM and GM and a best rating of 2692.

  • @_chrismis_7964
    @_chrismis_7964 4 months ago

    Great programming skills! I think you could make an extra video just showing the graphs/statistics, so that it would be more appealing for the broader chess community :)

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Hi there. Very good suggestion. I was thinking about several Shorts that just go through some of the results. Maybe with some interactive charts. But you are right, maybe a short ~5 min video would be best for that.

    • @_chrismis_7964
      @_chrismis_7964 4 months ago

      @TheDataDigest hell yeah 🦅

  • @marendor9087
    @marendor9087 4 months ago

    Awesome video! Could you share the name of the player who has a rating of approximately 3125 ELO and really high average points, as depicted on the thumbnail graph?

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Yes I can. Here are the top 10 players who participated at least 5 times in 2023, sorted by average score, with their average blitz rating and number of participations:
      1. Hikaru Nakamura: 8.64 | 3289 | 74
      2. Wesley So: 8.59 | 3102 | 22
      3. Magnus Carlsen: 8.51 | 3268 | 39
      4. Dmitry Andreikin: 8.35 | 3055 | 84
      5. Denis Lazavik: 8.25 | 3049 | 40
      6. Yu Yangyi: 8.14 | 3051 | 7
      7. Bogdan Daniel Deac: 8.11 | 3033 | 59
      8. David Navara: 8.08 | 2943 | 6
      9. Aleksei Sarana: 8.06 | 3066 | 74
      10. Jan-Krzysztof Duda: 8.04 | 3030 | 60

    • @TheDataDigest
      @TheDataDigest 4 months ago

      I will soon share the data and scripts on GitHub, then you could answer these kind of questions yourself, if you download R/R-Studio, which is free. Only if you are interested in data analysis of course :)

    • @marendor9087
      @marendor9087 4 months ago

      Thanks, that one dot really caught my interest, turns out it was Wesley So. Once again, great video keep it up 😁

    • @Kivallara
      @Kivallara 4 months ago

      @TheDataDigest Well done video and looking forward to try to analyze the data myself. What's your github?

    • @TheDataDigest
      @TheDataDigest 4 months ago

      @Kivallara github.com/TheDataDigest/EDA/tree/main/Chess But I also pinned it as top comment.

  • @hamedrajhi
    @hamedrajhi 4 months ago

    I wasn’t expecting much when I clicked. But man you kept delivering more and more. Well done. Subscribed and will definitely watch everything you produce 👏🏼👏🏼

    • @TheDataDigest
      @TheDataDigest 4 months ago

      Thank you so much for your comment. I am glad you enjoyed it. It was actually performing really poorly in the first days with regards to click-through rate and views, but now it seems that the algorithm is slowly starting to find potential viewers that might enjoy it. So again, thanks for engaging by leaving a comment and subscribing.

  • @floriansitte-kratzsch7355
    @floriansitte-kratzsch7355 4 months ago

    First. Good job after a long pause!

    • @TheDataDigest
      @TheDataDigest 4 months ago

      6 months, or 183 days exactly :)

  • @ismailsani3382
    @ismailsani3382 4 months ago

    Any video on regional climate data download, analysis and prediction in R please. others who have may help please, many thanks

  • @ismailsani3382
    @ismailsani3382 4 months ago

    You are second to none. Permit me to say you are the R.

  • @suying-meow
    @suying-meow 4 months ago

    Thanks for your tutorial, it will be better if you can add more details about what each code means.

  • @lucy6692
    @lucy6692 5 months ago

    What do I do if I want to assign shapes to more than 6? how do I assign them individually?

    • @TheDataDigest
      @TheDataDigest 5 months ago

      Hi Lucy, thanks for leaving a comment. If you run the following code:

      test_df <- data.frame(x = 1:10, y = 1:10, shapes = LETTERS[1:10])
      ggplot(data = test_df, mapping = aes(x=x, y=y, shape = shapes)) + geom_point(size = 3)

      you get this warning message and no shapes for the letters G-J:

      Warning messages: 1: The shape palette can deal with a maximum of 6 discrete values because more than 6 becomes difficult to discriminate; you have 10. Consider specifying shapes manually if you must have them.

      You can manually overwrite this with the "scale_shape_manual()" function:

      ggplot(data = test_df, mapping = aes(x=x, y=y, shape = shapes)) + geom_point(size = 3) + scale_shape_manual(values = 1:10)

      However, I recommend, if possible, either grouping your levels or working with colors, for example using 4 shapes with 2 different colors for a combination of 8. Or work with facet_wrap() to split your complexity into multiple charts.
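The recommendation to combine a few shapes with a few colors could be sketched like this. The grouping columns `shape_group` and `color_group` are hypothetical and only show one possible way to split 8 levels into 4 shapes times 2 colors.

```r
library(ggplot2)

test_df <- data.frame(x = 1:8, y = 1:8, level = LETTERS[1:8])

# Hypothetical grouping: 4 shape groups x 2 color groups = 8 unique combinations
test_df$shape_group <- factor(rep(1:4, times = 2))
test_df$color_group <- factor(rep(c("first", "second"), each = 4))

combo_plot <- ggplot(test_df,
                     aes(x = x, y = y, shape = shape_group, color = color_group)) +
  geom_point(size = 3)
```

With only 4 shape levels the default shape palette is sufficient, so no warning appears and no manual scale is needed.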