violin plots should not exist

  • Published on 14 Sep 2023
  • Violin plots are never the best version of a plot. They are hard to read and bad.
    Violinplot: www.stat.cmu.edu/~rnugent/PCM...
    Beanplot: www.stat.cmu.edu/~rnugent/PCM...
    Most plots from Harvard Open Courseware stuff: www.labxchange.org/library/it...
    Patreon (join for exclusive video each month): / acollierastro
  • Science & Technology

Comments • 2.3K

  • @KingBobXVI
    @KingBobXVI 9 months ago +2454

    If I ever write a paper, I'm going to not use a violin plot, and I'm going to cite you for why I didn't use a violin plot.

    • @acollierastro
      @acollierastro 9 months ago +1167

      Great point. I demand citations every time someone avoids a violin plot in the future.

    • @bearsaroundhere
      @bearsaroundhere 9 months ago +75

      @acollierastro If I wasn't going to use one anyway, should I still cite you so that it's obvious why I didn't?

    • @ozymantiasVI
      @ozymantiasVI 9 months ago +91

      @acollierastro I'll do you one better: when I review a paper and it uses a violin plot, I'll ask the authors to replace it with a boxplot + histogram and cite your video.

    • @georgelionon9050
      @georgelionon9050 9 months ago +39

      Instead of people citing a video, someone should write a paper. I suggest the title: "Violin plots considered harmful" (because their past existence cannot be undone).

    • @xaxfixho
      @xaxfixho 9 months ago +3

      Getting passive aggressive, annoying vibes 😮
      Transwomen are women 🙋‍♀️🙋🙋‍♂️

  • @HoneyBadgerLikesYou
    @HoneyBadgerLikesYou 9 months ago +1411

    Angela utilizing her platform to indoctrinate the public against evil plots in an intellectual crusade is what I live for

    • @livingroomviewing2987
      @livingroomviewing2987 9 months ago +1

      That part.

    • @werawerlnwerlnrlnelr
      @werawerlnwerlnrlnelr 9 months ago +48

      the plot thickens

    • @idontwantahandlethough
      @idontwantahandlethough 9 months ago +4

      that was quite clever

    • @benedixtify
      @benedixtify 9 months ago +5

      Evil plots

    • @zimbu_
      @zimbu_ 9 months ago +10

      What do we want?
      Good data visualization!
      When do we want it?
      Now!

  • @TalysAlankil
    @TalysAlankil 8 months ago +446

    i spent 27 minutes going "okay is she going to say they look like vulvas" and then felt very validated

    • @ookazi1000
      @ookazi1000 7 months ago +34

      Yeah, I was nodding along to the mechanical critiques of the plots, and realized about three-quarters of the way through the first half (I'm a bit slow about these sorts of things cause I'm an asexual cis man who ain't get none and ain't want none neither) that oh yeah, these do kinda look like genitals (and it only clicked cause one of em kinda looked like a penis and that made it click that the rest of em look like vulvas) and was like: Huh, That's weird: I wonder if anyone else noticed that, and if she's gonna mention it at all?

    • @luckyape
      @luckyape 6 months ago +9

      spoiler alert

    • @GSBarlev
      @GSBarlev 6 months ago +13

      Meanwhile I'm here getting distracted by the ones that look like stingrays. 🤷‍♂️

    • @fburton8
      @fburton8 4 months ago +1

      I call them snot plots because to me they look like gloopy boogers such as you sometimes see in kids with colds.

    • @nickcarroll8565
      @nickcarroll8565 3 months ago +4

      @GSBarlev the shrinks are going to have a field day with you 😂

  • @richardurwin4432
    @richardurwin4432 9 months ago +584

    Have you noticed that most graph paper uses a relatively faint blue colour? That is because photocopiers (before they became scanners) couldn't see blue well. If you were careful with the contrast, you could draw a diagram or a form on graph or squared paper and the photocopier would come out with only your lines on it. It was a great way to create forms, character sheets for RPG games or, indeed, the sorts of diagrams in an academic paper.
    Regarding the typewriter thing, you could get whiteout on paper strips. You backspaced over the error, poked the whiteout paper behind the ink-ribbon and hit the erroneous letter again. That deleted it and you could over-type again with the correction. The line spacing on typewriters is quantised so you can go up and down by exact lines easily. It's only when you noticed the error after you had taken the paper out that you would have to resort to liquid whiteout and fiddle to get the positioning correct.

    • @mikedavis979
      @mikedavis979 8 months ago +16

      In my typing class in high school (wow, am I really that old?), we used that method as well. We also had fancy Selectric-2's that had a built-in white-out strip! But i also remember just rolling the paper up a bit, applying white-out, letting it dry, and rolling it back down by the same amount. No need to take the paper out altogether. This would have been 1987 or 1988. The tail end of the typing class days, before keyboarding became standard, and Apple II's or Macs, etc., became cheap enough to replace typewriters, to teach typing. I mean, we had a few Apple IIs, but they were for "computer class", not "typing class".

    • @richardurwin4432
      @richardurwin4432 8 months ago +4

      @mikedavis979 That was around a decade after I went to school. In my day only the girls did shorthand and typing. I've cursed that fact many times during my IT career. By the time I realised it would be useful to be able to touch-type, I was too fast without it to be able to stick to learning.

    • @LathosZan
      @LathosZan 8 months ago +8

      I had a typewriter from my dad that had a little tape strip in it for corrections instead of white-out. You'd hit the backspace key to go back one character space, and the delete lever would push the ribbons up so the correction tape was in line. Then you'd just smack the key for the offending character a few times until the tape pulled all the ink up, turn off delete, and keep typing. You could even do neat stuff by combining some inked characters with other characters on the correction tape: clearing a + across an inked M looked pretty cool, and it made good on-demand icons if you needed them for something.

    • @richardurwin4432
      @richardurwin4432 8 months ago +12

      @LathosZan That was a carbon ribbon instead of an ink ribbon, a plastic film with a layer of carbon on it. The rich people like important executives used those. They produced more professional-looking text because the ink didn't bleed into the paper. But the ribbon could only be used once; when the shape of a letter had been transferred onto the paper, it left a transparent hole in the tape. There was industrial espionage where people stole the secretaries' used typewriter ribbons and read the documents they'd been typing from the ribbon. Carbon ribbons were expensive and had to be regularly replaced. Ink ribbons just reversed themselves each time through and you kept using them until your text looked too faint. If you were enterprising you could even re-ink them yourself, in much the same way that you can re-fill ink cartridges today.

    • @LathosZan
      @LathosZan 8 months ago +1

      @richardurwin4432 Neat!

  • @BillTranmer
    @BillTranmer 9 months ago +1951

    I love your videos. It's like if science was a city and you're a tour guide taking us to all the places where people get stabbed.

    • @matthewpower3062
      @matthewpower3062 9 months ago +72

      Funniest description ever!

    • @donpietruk1517
      @donpietruk1517 9 months ago +47

      Could you make a violin plot outlining the distribution density of those stabbing areas for us please? I'll show myself out now. 😂😂

    • @gravity_mxk5663
      @gravity_mxk5663 9 months ago +10

      Omg I’m dead 😂

    • @samwiseshanti
      @samwiseshanti 9 months ago +9

      Omg please tell me you didn't come up with that off the cuff, what a perfect description

    • @n20games52
      @n20games52 9 months ago +42

      Also known as: Violence Plots.

  • @AdrianBoyko
    @AdrianBoyko 9 months ago +683

    Hello! I am the creator of the Turnip Plot, which is similar to a violin plot but rotated around the vertical axis and rendered in 3D. Please cite my paper. Thanks!

    • @LimeyLassen
      @LimeyLassen 9 months ago

      Sounds like a lot of accidental buttplugs to me 😅

    • @carpathianhermit7228
      @carpathianhermit7228 9 months ago +6

      Why do you need citation

    • @ts4gv
      @ts4gv 9 months ago +1

      @carpathianhermit7228 Turnip

    • @hsm4983
      @hsm4983 9 months ago +123

      if u don't cite my beyblade plot paper in your turnip plot paper I'm citing u for plagiarism

    • @samsowden
      @samsowden 9 months ago

      hur hur hur looks like b*schrödinger'scat*tt pl*schrödinger'scat*g

  • @blakethomson7901
    @blakethomson7901 9 months ago +296

    The kooky patterns make it so that colorblind people can read the histograms. I'm red-green colorblind, and I get really passionate about data visualization, partially because I have trouble with certain color visuals that other people don't struggle with, and partly because data visualization is a very efficient way to communicate info when done correctly, and I also have adhd, so I appreciate their communicative strengths.
    But yeah, all that to say, I really appreciate when kooky patterns are included. It immediately tells me that the person who made that visual is thinking about how their data will be perceived and wants to communicate it effectively to as many people as possible.

    • @tglittle3166
      @tglittle3166 9 months ago +40

      Designing data representation and slides for talks with color blind people and dyslexics in mind is a drum that I always have to beat with my trainees. A lot of the arbitrary complaining that people do about visualization and typeface etc is actually pretty ableist. Fun example is that comic sans is actually one of the easier fonts for people with dyslexia to read.

    • @QuentinWes
      @QuentinWes 8 months ago +15

      Interesting that it helps with colorblindness, i always ran into it in school worksheets and tests. They were printed in black and white so needed a way to differentiate between grey and slightly lighter grey, and we all just had pencils to make them ourselves. It became a bit of a competition as to who could come up with the weirdest patterns for their bar charts that were still distinct. Always nice when adaptations for specific things end up being useful for unintended reasons

    • @TheMvlproductionsinc
      @TheMvlproductionsinc 7 months ago

      @tglittle3166 The Comic Sans thing is a myth, look it up. The same goes for fonts specifically designed for dyslexia, like OpenDyslexic. More important than anything is font size.

    • @charlesloeffler333
      @charlesloeffler333 7 months ago +12

      Yes, the data visualization field should deliberately popularize plot color selections that don’t penalize the red-green folks. Or, encourage thinking about using line and hatching types that don’t require colors to distinguish

    • @hasch5756
      @hasch5756 7 months ago +16

      This is called hatching and there's a whole system behind it. It comes from heraldry and was developed in the 17th century back when there was already significant demand for printed illustrations but printable coloured ink was not yet invented. Basically, you have hatching patterns representing six basic colours; red which you represent with vertical lines, blue which is horizontal lines, yellow (or gold) which is dots, green which is diagonals from top left to bottom right, purple which is diagonals the other way, and black which is a grid. You get pastel tones by replacing the lines with dashes, and you can mix colours by overlaying the hatching of both, for example dashed verticals give you pink and if you interleave that with dots, you get orange
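As a rough illustration of the six base hatchings listed in the comment above, here is a minimal Python sketch. The dictionary name `HERALDIC_HATCHES`, the helper `hatch_for`, and the pairing with matplotlib-style hatch strings are assumptions made for this example, not part of any heraldic or plotting standard. The dashed pastel variants and the overlaid mixes are not modelled.

```python
# Hypothetical lookup table: the six colours from the comment, each paired
# with a matplotlib-style hatch string that resembles the described pattern.
HERALDIC_HATCHES = {
    "red": "|",      # vertical lines
    "blue": "-",     # horizontal lines
    "yellow": ".",   # dots
    "green": "\\",   # diagonals from top left to bottom right
    "purple": "/",   # diagonals the other way
    "black": "+",    # grid
}


def hatch_for(colour: str) -> str:
    """Return the hatch string for a colour name, ignoring case."""
    try:
        return HERALDIC_HATCHES[colour.lower()]
    except KeyError:
        raise ValueError(f"no hatching defined for {colour!r}") from None


def hatches_are_distinct() -> bool:
    """Check that no two colours share a pattern (the whole point of hatching)."""
    return len(set(HERALDIC_HATCHES.values())) == len(HERALDIC_HATCHES)
```

If matplotlib is available, a string from this table could be passed as the `hatch` argument of a bar, for example `ax.bar(x, height, hatch=hatch_for("red"))`, so that bars stay distinguishable without relying on colour.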

  • @erdvige
    @erdvige 6 months ago +50

    As a data analyst, you've made amazing points for not using violin plots--scientifically. But in the business world, violin plots are **pretty** and the vibe is how you get execs to make the decisions in favor of what you want 😂😂😂😂

    • @GiovanniBottaMuteWinter
      @GiovanniBottaMuteWinter 4 months ago +2

      But do the execs understand those plots or do they just think they are cute?

    • @daniel6678
      @daniel6678 4 months ago +13

      @GiovanniBottaMuteWinter I think that’s the point they’re making… they’re saying that in the business world, the information doesn’t matter so long as it’s presented in an attractive way.

    • @londonalicante
      @londonalicante 2 months ago +4

      @daniel6678 Executives are only interested in the overview. They will naturally gravitate towards the violin plot for the overview data and assume the technical people will handle the details in the histogram. We make judgements on how relevant info is to us all the time, and pretty is not necessarily better. For example, the more visually appealing a letter through my door is, the quicker it will be identified as junk mail and thrown away.

    • @spuriusbrocoli4701
      @spuriusbrocoli4701 11 days ago

      Head on the nail, tbh. As I was learning data science post-BA, I was struck by how differently academics are expected to visualize info vs how you need to present that same data to "general audiences", i.e. rich kids who coasted into "business".

    • @robertaylor9218
      @robertaylor9218 9 hours ago

      I’m a super lay person. My problem is that it is about the only plot that I’ve seen that doesn’t intuitively communicate anything.

  • @bilbobaggin3
    @bilbobaggin3 9 months ago +644

    Librarian here: Physical journals, books, etc. are often preferred when it comes to preservation and access, since when we buy a physical copy, we won't lose access to it when the publisher decides that we need to pay another $400 per user to access their system, or if a ransomware attack takes out the archive. Obviously, storing all these is impractical for all but the largest library systems, but someone needs to cite an article in the 2003 summer edition of Phrenology Today so rarely that you can get away with only one copy in an offsite, climate-controlled storage space.
    The number of times I've had to do dumb piracy shit because a publisher literally pulled access to a non-downloadable article I was using for a paper *mid semester* was infuriating. Not to mention the time my school literally stopped paying our publisher liaison because they upped their rates by a factor of 10. We had a lot of 3rd/4th generation photocopies floating around that year, lmao

    • @Emilio1985
      @Emilio1985 9 months ago +45

      I love love love my campus library staff, at every campus I've been affiliated with. You all rock! And there's something so nice about just walking through the stacks to flip through journals rather than just scrolling and clicking through online lists of volume/issue numbers.

    • @trespaul
      @trespaul 9 months ago +29

      god bless Alexandra Elbakyan

    • @AR0ACE
      @AR0ACE 9 months ago +41

      Phrenology Today lol

    • @martinnovacek9151
      @martinnovacek9151 9 months ago +8

      @trespaul She's the best. I really miss the old, up-to-date scihub tho :(

    • @obrotherwhereartliam
      @obrotherwhereartliam 9 months ago +11

      Public access now!

  • @davichk
    @davichk 9 months ago +310

    I used to work at a tight tolerance thin film deposition optics manufacturer. One day my current supervisor visited my workstation unexpectedly and asked, "you haven't been using violin graphs in any of your report generators, have you?"; "Never heard of such a thing. Why?"; "Good. I knew you were a smart one. Just don't. They're useless anyway." ... Later she quietly explained their inappropriateness. Turns out she wasn't just trying to prevent her team from using them either. The owner got top management together and instructed them to purge any current work of their existence. HE was smart.

    • @SuND4a1
      @SuND4a1 9 months ago +2

      This story is so wholesome.

    • @SlenderSmurf
      @SlenderSmurf 7 months ago +2

      Inappropriateness? Did they mean from a scientific or a social perspective?

  • @brindlebucker4741
    @brindlebucker4741 9 months ago +276

    I'm not a scientist. Just a humble welder. I like your videos. I had no idea where you were going with this, but you had me nodding along, thinking, 'I get it. These violin plots are stupidly complicated and are not as efficient as other plots.' I get it! And you know, not being involved with the sciences, I didn't actually care about the plots one way or the other, but I could see how it would annoy an actual scientist/researcher. Then it finally got to where it was going, and I was like, 'Damn! That was masterfully done.' Keep up the great work.

    • @lost4468yt
      @lost4468yt 8 months ago +21

      "humble welder" - I like how you just had to validate your stereotype of "how do you know someone's a welder? They'll tell you" on the first line.

    • @teremleonheart3776
      @teremleonheart3776 7 months ago +3

      @lost4468yt Just like vegans and horse girls.

    • @NightsReign
      @NightsReign 7 months ago +7

      @teremleonheart3776 When you say "horse girls", do you mean equestrians, or is this some new horror? 🤔

    • @teremleonheart3776
      @teremleonheart3776 7 months ago +7

      @NightsReign this comment honestly made me chuckle for a good bit, i mean girls that are super duper into horses, n have posters of em and backpacks with horses on em, that typa horse girl lmao 😂😂😂😂

    • @CrownRock1
      @CrownRock1 5 months ago +7

      @teremleonheart3776 And now I'm laughing about the connection between welders and horse girls that I had never seen before, but can't unsee now.

  • @IntuitiveAndExhaustive
    @IntuitiveAndExhaustive 9 months ago +171

    Hello! I like violin plots a lot; I'm a data science researcher.
    A violin plot displays not only easily comparable mean and quartile information, but also more granular information about the shape of the distribution. This gives violin plots a unique ability to intuitively inform certain decisions about further analysis, especially when you're exploring new data with numerous distributions for the first time.
    Overlaid histograms have the issue of sharing the same axis, which, when trying to understand intricacies of distributions, can be difficult to read. Box plots are easy to read but can obscure information, maybe leaving readers to question if the choice of a box plot was appropriate.
    A violin plot allows you to render an easily interpretable plot which lays bare qualitative aspects of the underlying distribution. This not only allows for easy analysis via the box plot, but also high-level qualitative understanding.
    When I read a violin plot, I never care about the scale of the distribution, only the shape, which I think they show fairly well.
    Of course, when publishing I may or may not use them. I find them incredibly good for visualizing data exploration, and like to use them when explaining datasets more so than results.
    On the point of smoothing, totally. That's why I've gravitated towards swarm plots for general qualitative distribution understanding. But smoothing is an issue in itself; histograms have essentially the same issue in terms of bin size.
    Also, worth noting, I'm colorblind af, so overlaid color information may as well be gibberish to me, which might be part of the reason I hate overlaid histograms so much.
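The point above about smoothing and bin size can be made concrete with a small sketch. It uses only the Python standard library and a made-up bimodal sample; the function names (`histogram`, `gaussian_kde`, `count_peaks`) and the specific bin counts and bandwidths are assumptions chosen for illustration, not anything taken from the video or from a plotting library.

```python
import math
import random


def histogram(data, bins, lo, hi):
    """Count how many values fall in each of `bins` equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:
            i = min(int((x - lo) / width), bins - 1)  # x == hi goes in the last bin
            counts[i] += 1
    return counts


def gaussian_kde(data, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate of `data` at each grid point."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in data) / norm
        for g in grid
    ]


def count_peaks(values):
    """Count strict local maxima in a sequence."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


# Synthetic sample (invented for this example): two well-separated clusters.
rng = random.Random(0)
sample = [rng.gauss(-2, 0.5) for _ in range(200)] + [rng.gauss(2, 0.5) for _ in range(200)]
grid = [-5 + 0.1 * i for i in range(101)]

coarse_hist = histogram(sample, bins=2, lo=-5, hi=5)    # very wide bins
fine_hist = histogram(sample, bins=20, lo=-5, hi=5)     # narrower bins
wide_kde = gaussian_kde(sample, bandwidth=3.0, grid=grid)    # heavy smoothing
narrow_kde = gaussian_kde(sample, bandwidth=0.3, grid=grid)  # light smoothing

print("2 bins:", coarse_hist)
print("20 bins:", fine_hist)
print("KDE peaks, bandwidth 3.0:", count_peaks(wide_kde))
print("KDE peaks, bandwidth 0.3:", count_peaks(narrow_kde))
```

Both knobs behave the same way: with too few bins or too wide a bandwidth the two clusters blur together, so neither a histogram nor a violin's smoothed outline is free of a tuning choice.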

    • @benprytherch9202
      @benprytherch9202 9 months ago +30

      I'm not colorblind and I also can't read an overlaid histogram. Way too cluttered, and in my mind I'm trying to imagine what they'd look like not overlaid.

    • @IntuitiveAndExhaustive
      @IntuitiveAndExhaustive 9 months ago +6

      I promise I'm not as stupid as my spelling suggests.

    • @TheManifoldTruth
      @TheManifoldTruth 9 months ago +31

      Honest question, but why not just use staggered histograms then? What does the rotation and mirroring add? Including information on medians/averages (and quartiles if you really need to, but if you have the histogram there anyway why would you) could be done in pretty much any format you choose.

    • @IntuitiveAndExhaustive
      @IntuitiveAndExhaustive 9 months ago +40

      @TheManifoldTruth That's a great question, and the honest answer is it's really convenient to plot violin plots with seaborn.
      Another more honest answer is the aspect ratio of monitors. While they don't have to, histograms have their density along the Y axis, meaning, if you have a lot of distributions you want to compare, it's easier to fit a violin plot, which orients things horizontally. Yes, you could just rotate the histogram horizontally, but the love of making the "perfect" vs the "good enough" plot starts to die out around your 10,000th plot in your career.
      Another, maybe more satisfying but less honest answer, is the mean and standard deviation of the distributions is useful in comparison, and that comes out of the box in most violin plots.
      Really, the debate around this feels like the debate around the oxford comma; strong opinions around "rules" which are really well entrenched but still arbitrary preferences.
      I don't have any evidence to back this up, but I wouldn't be surprised if violin plots were more common in more data rich and fast moving research domains like data science rather than physics. In data science, making a plot that's good enough quickly is way more attractive given the sheer volume of visualization required in the domain.
      I have to note though, a lot of my takes make a lot more sense in a business context, rather than an academic context. Papers take a long time to make, so having a sub-par plot makes much less sense.

    • @tglittle3166
      @tglittle3166 9 months ago +6

      Came here to say many of the things that you said. Thank you for saying them more completely.

  • @user-xs9wp8ze4m
    @user-xs9wp8ze4m 9 months ago +95

    "You can't actually get data, you're just getting vibes." Brilliant.

  • @luna010
    @luna010 9 months ago +751

    The technology exists for us to put 3d objects into .pdfs. For this reason, I propose the vase plot.
    The diameter of the vase is the probability density.
    or maybe the cross-sectional area is the probability density.
    It is intentionally ambiguous.
    Please cite my comment whenever you use my vase plot in your paper.

    • @jainabraina
      @jainabraina 9 months ago +51

      The inner diameter or outer diameter or cross sectional area (selected at random when you run the program, no the module is not seedable) of the vase is the probability density. uploading this to npm and pypi asap

    • @marcellarisa7239
      @marcellarisa7239 9 months ago

      Buttplug plot

    • @rojnx9
      @rojnx9 9 months ago +75

      I suggest a 4 dimensional plot, called the aerofoil plot, where the aerodynamic drag coefficient of each 3 dimensional cross section of the 4d shape determines the probability density. Also every single 3d cross section is vaguely penis shaped.
      Please cite me

    • @rbr1170
      @rbr1170 9 months ago +18

      How about a 4d object casting a 3d shadow? The orientation of a 4d object projects n 3d shadow where n gives the spectrum of possible densities depending on the k-orientation of a complex data mapped onto a 4d-manifold.

    • @vaporisedair4919
      @vaporisedair4919 9 months ago +23

      Call it the amphora plot, to get the Greek creds

  • @VivekPatel-ze6jy
    @VivekPatel-ze6jy 9 months ago +95

    Your story about everyone turning to look at you as the only woman... it gave me a flashback to school sex-ed where the teacher (to be inclusive) said "or a**l sex" and people turned to look at me which basically outed me to the teacher. But also what reaction were they expecting from me lmaoo

    • @GSBarlev
      @GSBarlev 6 months ago +9

      This was a great reminder about effective strategies for allyship.
      Any of the other ten guys in the room could have made the response she described, and it wouldn't have even seemed like White Knighting. That they turned to her was definitely motivated by empathy, but the effect is the same as what happened to you.

  • @ealloc
    @ealloc 9 months ago +19

    An alternative to a violin plot is a "beeswarm" plot: Instead of a smoothed density you plot each individual datapoint as a dot at its exact y-value, and the x-values of the points are chosen so the dots don't overlap, causing y-coords with a lot of dots to bulge out.
    I like them because you can simultaneously see the raw datapoints, and also see the broad distribution. One problem is that in naive implementations you get chains of points extending out from the center in a line, giving a christmas-tree appearance. But good implementations can avoid this.
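The placement rule described above (exact y-value, x chosen so dots don't overlap) can be sketched in a few lines of Python. This is a deliberately naive version under stated assumptions: the function name `beeswarm`, the `radius` parameter, and the "try the centre, then step outwards left and right" search are choices made for this illustration, not the algorithm of any particular plotting library. It is the kind of simple implementation that can produce the chained, tree-like look mentioned in the comment.

```python
import math


def beeswarm(values, radius=0.5):
    """Give each value an x offset so that circles of `radius`, centred at
    (x, value), do not overlap. Returns (x, y) pairs in input order."""
    diameter = 2 * radius
    placed = []                       # (x, y) of dots already positioned
    result = [None] * len(values)
    # Work from the lowest y upwards so close neighbours are handled together.
    for idx in sorted(range(len(values)), key=lambda i: values[i]):
        y = values[idx]
        # Only dots within one diameter vertically can possibly collide.
        near = [(px, py) for px, py in placed if abs(py - y) < diameter]
        step = 0
        while True:
            # Try the centre first, then alternate right and left, moving outwards.
            candidates = [0.0] if step == 0 else [step * radius, -step * radius]
            free = [
                x
                for x in candidates
                if all(math.hypot(x - px, y - py) >= diameter for px, py in near)
            ]
            if free:
                x = free[0]
                break
            step += 1
        placed.append((x, y))
        result[idx] = (x, y)
    return result


points = beeswarm([1.0, 1.0, 1.0, 1.1, 3.0, 1.05], radius=0.5)
for x, y in points:
    print(f"x={x:+.2f}  y={y:.2f}")
```

Every dot keeps its true y-value, so the raw data stays readable, while crowded y-ranges are pushed outwards and form the bulge. Better implementations typically differ in how they pick the x offset, not in this basic idea.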

  • @jasontracey3416
    @jasontracey3416 9 months ago +574

    I'm convinced people only use them because they look vaguely sexual

    • @KillerOfWhales
      @KillerOfWhales 9 months ago +37

      A pretty good reason tbh

    • @wtfpwnz0red
      @wtfpwnz0red 9 months ago

      I never heard the name until today. I always imagined them as vulvas with misplaced clits and thought when they showed up it was scientists being juvenile somehow

    • @GelidGanef
      @GelidGanef 9 months ago +165

      I'm convinced they're only called violin plots because they couldn't get a paper to publish "vagina plot"

    • @Emilio1985
      @Emilio1985 9 months ago +37

      And STEM is still largely a boys-club, so there is increased tolerance for anything that even subtly makes non-men uncomfortable.

    • @bulldozer8950
      @bulldozer8950 9 months ago +2

      To be fair, if there was a plot that looked like a dick, men would certainly also use that just because it looks like genitals.

  • @rwgoble
    @rwgoble 9 months ago +836

    When I was in undergrad getting my degree in economics, I saw these every once in a while when conducting research for papers. Turns out I wasn't an idiot for not being able to read these plots, I was just an idiot for getting a degree in economics.

    • @Gersberms
      @Gersberms 9 months ago +16

      I thought a degree in economics was a guaranteed job at Amazon. Is that still a thing?

    • @therealpbristow
      @therealpbristow 9 months ago +67

      @Gersberms If true, that's the best reason I've heard for *not* getting an economics degree! =:o\

    • @Bozebo
      @Bozebo 9 months ago +5

      @Gersberms Isn't Amazon fundamentally bad for the economy, so they're guaranteed to only hire bad economists? xD

    • @TessHKM
      @TessHKM 9 months ago +12

      @Bozebo it depends on if you view "the economy" as something that's meant to serve small businesses/producers or something that serves consumers.

    • @azlanadil3646
      @azlanadil3646 9 months ago

      @TessHKM I think “the economy” is generally meant to serve everyone. Amazon is obviously not good for producers and small businesses, but it is also bad for consumers in the long term. Yes, it does provide them cheaper goods, but it also results in wealth being funnelled out of communities, which kills small towns. It also results in worse working conditions and lower pay for people in cities. Overall it’s a net negative to the living standard of the average person.

  • @dutubsucks
    @dutubsucks 9 months ago +91

    This video made me think back on something you said in your video about imposter syndrome, the part about the 10% who are just fantastic and own it.
    You talk about how bad violin plots are for 40 minutes while being funny, educational, making scientific and communicative arguments while also bringing in important social and feminist aspects in a clever way.
    Like, who does that? How can someone talk about violin plots for 40 minutes and be this entertaining and make so many good points?
    I am so glad I found this channel!

  • @sageanastasi2028
    @sageanastasi2028 9 months ago +40

    Related to that very last point about "why not just make it half the graph", when they made us do violin plots in high school biology we were told to put *half* the data on each side so that the width of the *whole* thing matched the amount of data. Which is absolutely not how anyone else does their violin plots. Also we had to do them by hand, which is as excruciating as it sounds

    • @SlenderSmurf
      @SlenderSmurf 7 months ago +5

      I thought that was how they were drawn as well. So that the area of the shape has an actual meaning, which is something intuitive to look at. Although now that I think about it there is no x-axis so doubling or halving all of the widths doesn't change anything.
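The observation that doubling or halving the widths changes nothing can be checked with a tiny sketch. The helper name `violin_outline`, the `max_half_width` parameter, and the bin counts are all made up for this example; it assumes the common convention that a violin is rescaled so its widest point fills a fixed width.

```python
def violin_outline(counts, max_half_width=1.0):
    """Turn per-bin counts into mirrored (left, right) x-extents for a violin.
    Widths are rescaled so the widest bin reaches max_half_width on each side."""
    peak = max(counts)
    if peak == 0:
        return [(0.0, 0.0) for _ in counts]
    return [(-c / peak * max_half_width, c / peak * max_half_width) for c in counts]


counts = [1, 4, 9, 4, 2]                                   # invented bin counts
full_each_side = violin_outline(counts)                    # full count drawn on each side
half_each_side = violin_outline([c / 2 for c in counts])   # half the data on each side

print(full_each_side == half_each_side)
```

Because the outline is normalised to the widest bin and there is no x-axis scale, the "half on each side" convention and the usual mirrored one draw the same shape.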

  • @christopherknight4908
    @christopherknight4908 9 months ago +121

    Yes to the symmetry argument. I spent the entire video trying to figure out why they were twice the size they needed to be.

  • @eddieantonio
    @eddieantonio 9 months ago +215

    During my masters, my advisor actually ENCOURAGED the use of violin plots, and I didn't really question it at the time. Rude jokes were made. I have a publication that has FIVE SEPARATE PAIRS of violin plots (and plots that cluster points into hexagons? for some reason?). And you're right! Side-by-side histograms would have been BETTER and MORE COMPACT 😭😭😭

  • @Tim3.14
    @Tim3.14 9 months ago +35

    While I also find violin plots a bit hard to read, for something like the paper at 13:30 where they apparently want to compare 7 different probability distributions side-by-side, I'm not sure any other option would be much more readable.
    It's probably too many to overlay the probability densities on top of each other (although I agree that's a good option for comparing 2 or 3 distributions). I guess they could do 7 side-by-side histograms or pdfs. 🤷🏻‍♂️
    (By the way, Fig. 3 and 4 you point to aren't actually a histogram of the same thing, they're a bar chart of something else. Note the x-axis isn't numeric, unlike the y-axis of the violin plot. Sorry to nitpick!)

    • @ethanpayne4116
      @ethanpayne4116 7 months ago +8

      I made a comment saying basically the same thing, the arguments in this video don't actually make sense in the context of the examples given. Even though I have never personally used violin plots before, I am now convinced that they are a very effective way of visualizing many distributions at once without overlap.

    • @andybrice2711
      @andybrice2711 4 months ago +4

      That's what I was thinking. But then the Ridge-Line Plot at 21:20 looks like it's probably superior in every way.

    • @mattc2327
      @mattc2327 4 months ago +2

      As a PhD student in Bio, I was also on the way to say this. I have a lot of overlapping distributions for a lot of conditions. I think one solution is to distill your conditions into the truly necessary ones. Then, I think the ridge-line plot (or a less overlapping version of it) is definitely better than a violin plot

  • @FordFourD-aka-Ford4D
    @FordFourD-aka-Ford4D 9 months ago +54

    You'd *apply whiteout w/ the page STILL inside* the typewriter, wait about 1 minute, and then use the *backspace* key to shift _back a space_ to your mistake so you can apply new ink over the dried whiteout.
    Later on there were typewriters that could apply the whiteout for you. Usually electric ones. There were some other more esoteric solutions too!
    But yeah, most people just applied something directly to the paper and shifted back a space. Calling it "backspace" on a computer keyboard is one of many holdovers from the typewriter days. So is the stubborn yet incorrect convention of double-spacing after a sentence.
    (Double-spacing is essentially a typewriter trick/convention that makes things easier to read because periods are so small and on some typewriters don't offset enough. Single space after a sentence has ALWAYS been typographically correct in the world of typesetting books - plus print & graphic design.)
    We use a lot of old terms and symbols that don't apply anymore. Like saying "rolling" when a camera starts recording comes from the early days of film when there was a step to roll the film.
    Same way that many save icons are still simplified shapes of floppy disks - lots of kids grow up associating that shape form with "saving" without actually knowing it's a real physical thing.
    Or how we associate the power symbol with turning things on (it actually was originally a standby-reset symbol or something, but that's a whole different conversation).

    • @NoeLPZC
      @NoeLPZC 9 หลายเดือนก่อน +6

      You have a source for that last paragraph? I've always heard the power symbol was a combination of 0 and 1 - a binary toggle for on/off.

    • @MissaBrevis
      @MissaBrevis 8 หลายเดือนก่อน +5

      ​@@NoeLPZCyou're both right - the 0 and 1 do represent binary states, but the version of the power symbol with the line crossing the circle was originally a standby symbol. If I remember correctly it was meant to indicate something like what we'd call sleep mode as opposed to turning something all the way off and on. The actual power-on-off icon was supposed to be the line totally within the circle, not crossing it.
      It's even still used in some specific cases now - I work in a lab and we have vortexers that have marked switch positions for on (line), off (circle) and touch-activated (line breaking circle) modes.

    • @adora_was_taken
      @adora_was_taken 7 หลายเดือนก่อน +2

      actually most correcting typewriters had an adhesive ribbon that would lift the letter off the paper. there's a fun technology connections video about it

    • @mykal4779
      @mykal4779 4 หลายเดือนก่อน

      this concept is called a skeuomorphism

  • @HunchbackJack
    @HunchbackJack 9 หลายเดือนก่อน +379

    I'm an old man, so I know something about typewriting erasure techniques. In rough order of technological advancement:
    1. hand-erasing with an ink eraser
    2. Liquid ink eraser solution, in a bottle. You would apply this to the typo on the page and it would break down the ink and fade it somehow.
    3. An "eraser strip" as part of the ink ribbon where you would retype the offending letter using the strip and it would abrade/absorb/dissolve the ink
    4. hand painting over the typo with white-out (from a bottle)
    5. hand-held whiteout strips, with dried whiteout on one side. You would retype the letter with one hand, holding the strip against the page where the hammer hits with the other hand.
    6. whiteout strip built into the ink ribbon. Same as the whiteout strip above, but you don't need to hold the strip; it's part of the ink ribbon.
    Most solutions did not require removing the paper, because you can *never* get it aligned again. There's typically enough space where the hammers hit the page for you to get in there with whatever erasing solution you're trying to use.

    • @RealDevastatia
      @RealDevastatia 9 หลายเดือนก่อน +19

      IIRC, the IBM Selectric III would automatically retype the previous character with the whiteout ribbon when you pressed the backspace key.

    • @RealDevastatia
      @RealDevastatia 9 หลายเดือนก่อน +5

      I'm amazed the spell checker didn't ding me for spelling "whiteout" without a hyphen. It's always "correcting" words that don't need correcting.

    • @artfuldodger5933
      @artfuldodger5933 9 หลายเดือนก่อน +6

      Neat! Thanks for sharing some history!

    • @tlecoyotl
      @tlecoyotl 9 หลายเดือนก่อน +3

      Being a 90's child I still managed to use both mechanical and electric typewriters. I had forgotten those whiteout strips! In my mind, those used to come in little red plastic boxes, kinda like chewing gum packs

    • @delusionnnnn
      @delusionnnnn 9 หลายเดือนก่อน +4

      Those Selectrics with a delete key and a special ribbon were the best! I forget which technology they used - "glueing off" a plastic based ink, or a white-out ribbon. I think they used white-out if I remember correctly, but the few that used a plastic based ink that you could glue off within a few seconds were an absolute delight since you could use that technology on almost any paper and it didn't matter what colour the paper was.

  • @martinnovacek9151
    @martinnovacek9151 9 หลายเดือนก่อน +565

    This channel really feels like having a cool older PhD friend who tells you all the secret tips and tricks and cool stories in academia

    • @tuomasmassa2954
      @tuomasmassa2954 9 หลายเดือนก่อน +4

      Exactly! ❤

    • @castroski7
      @castroski7 9 หลายเดือนก่อน +2

      It's the best

    • @Marc42
      @Marc42 9 หลายเดือนก่อน +2

      Spot-on!

    • @davido2644
      @davido2644 9 หลายเดือนก่อน +2

      Thanks, you really summarised the feeling so well! Love this channel ❤

    • @ubahfly5409
      @ubahfly5409 9 หลายเดือนก่อน +15

      Who you callin "old" , buster ?

  • @Aphidman1
    @Aphidman1 9 หลายเดือนก่อน +2

    You make me want to do video rants about bad research proposals, but then I realize that you are about 1000% better in front of the camera than I could ever be. Keep it up!

  • @nick_eubank
    @nick_eubank 9 หลายเดือนก่อน +29

    One principle of peer review is that we shouldn’t just assume authors are analyzing their data correctly. I appreciate violin plots because they provide the reader reassurance that the use of box plots is appropriate. Absent the density overlay, I worry (and sometimes rightly) that the authors are using box plots in inappropriate contexts (as evidenced by the fact that one sometimes sees multimodal distributions in violin plots in papers).

    • @mikedavis979
      @mikedavis979 8 หลายเดือนก่อน +3

      I agree, although I do agree with Dr. Collier that violin plots are less aesthetically pleasing. Plotting semi-transparent points over a box plot sometimes can work. Perhaps everyone should make a separate violin plot for reviewers, as well as box plot or something else. Hmmm....

    • @davidjohnston4240
      @davidjohnston4240 8 หลายเดือนก่อน +1

      I prefer to see a test of gaussianness (the actual test name escapes me right now). Then you have a one liner saying "yep, boxplots are good here". No wasted paper.

    • @toastedbread5985
      @toastedbread5985 7 หลายเดือนก่อน

      @@davidjohnston4240 I think you are referring to a Q-Q line plot? It gives a quick indication of whether the data is normal, and of any possible skew, at a glance. They are also very useful for comparing goodness of fit between distributions.
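A minimal sketch of the Q-Q idea mentioned in this reply, using only the Python standard library. The function name `normal_qq_points` and the sample values are hypothetical, chosen for illustration; the plotting positions `(i + 0.5) / n` are one common convention among several. Each sorted observation is paired with the quantile a fitted normal distribution would predict; points far from the diagonal suggest the data are not normal.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(data):
    """Pair each sorted observation with the matching quantile of a fitted normal."""
    ordered = sorted(data)
    n = len(ordered)
    # Fit a normal distribution using the sample mean and standard deviation.
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical sample with one large outlier.
data = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 9.8]
points = normal_qq_points(data)

for theo, obs in points:
    print(f"theoretical={theo:6.2f}  observed={obs:6.2f}  gap={obs - theo:+.2f}")
```

For normally distributed data the gaps stay small along the whole range; here the outlier sits well above its theoretical quantile, which is the kind of departure a Q-Q plot makes visible at a glance.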

    • @danielhicks1824
      @danielhicks1824 7 หลายเดือนก่อน

      ​@@davidjohnston4240normality lol

    • @mitchellsteindler
      @mitchellsteindler 6 หลายเดือนก่อน +1

      Just use a histogram

  • @obrothernotagain4668
    @obrothernotagain4668 9 หลายเดือนก่อน +79

    I remember vividly how my advisor emphasized how he read papers: title, authors, abstract then plots. If the plots were compelling then he'd dig in. Those plots absolutely need to tell a succinct and coherent story.

    • @halfstep44
      @halfstep44 9 หลายเดือนก่อน +4

      Interesting point. I've always thought of the graphs, plots, whatever they are as being a side dish
      What you said reminds me of my father telling me to always start with the maps in a book of military history, then decide if you want to purchase that book. Similar reason

    • @richardbloemenkamp8532
      @richardbloemenkamp8532 9 หลายเดือนก่อน +3

      Usually the quality of plots is very quick to evaluate. So if you have limited time and a lot of papers, it makes sense to judge a bit on plot quality before reading the whole paper. However, many plots require some significant explanation of the measurement system and conditions. Therefore, after a quick look at the plots, I often go back to the text.

  • @user-td3yi1mq7p
    @user-td3yi1mq7p 9 หลายเดือนก่อน +151

    This video sort of gave me the urge to come up with plots that are even more cursed than the violin plot. Like a stick figure plot where different aspects of the data set are represented by the size and orientation of the body parts.

    • @raygivler
      @raygivler 9 หลายเดือนก่อน +6

      It exists. Pie charts.

    • @nerdinleather
      @nerdinleather 9 หลายเดือนก่อน +4

      ​@@raygivlernah this is like you took a pie chart and cursed it

    • @crzyprplmnky
      @crzyprplmnky 9 หลายเดือนก่อน +16

      Can you pull up the latest set of Homunculus plots please? I'm a bit concerned about some outliers I saw 😂

    • @PokeCube_
      @PokeCube_ 9 หลายเดือนก่อน +14

      what about a scatter plot in audio form? you map numerical values to hertz values, and instead of coloring points and lines, you give them an instrument. like a guitar note would be played for each data point, and a violin is played to show the line of best fit. i'd call it a song plot

    • @L3X1N
      @L3X1N 8 หลายเดือนก่อน +2

      @@PokeCube_ Symphony plot?

  • @lilliangoodwin1037
    @lilliangoodwin1037 9 หลายเดือนก่อน +5

    I've been watching your videos for a while as someone who works in humanities but with a great interest in physics and finally I can explain something in turn!
    FWIW, corrections for even manual typewriters are a lot simpler than one might think. On most standard models there is a button off to the side that switches the ribbon (which by Einstein's time would at least have two or more types of ink in a uniform stripe across it, so it looks like one of those weird sour fruit roll ups that's segmented by color horizontally) to the 'correction' ink, which is thick, white and usually on the bottom; most of the time it just shifts the ribbon up like a centimeter or so, but some fancier typewriters can hold separate ribbons just for corrections that you can toggle on and off. There's no need to touch the paper, thankfully.
    After you've switched the ribbon and fixed your mistake, you can switch back to the regular ink seamlessly. You did typically have to move the hammer to where you made the mistake and type that exact letter that you screwed up again in order to cancel it out with a negative image in white ink, but muscle memory makes it like riding a bike.
    Plots were virtually impossible on these things, though, as you said-- it would be like trying to make ASCII art by hand but you can only use a springy lever to move the cursor.

  • @thosewhowish2b693
    @thosewhowish2b693 7 หลายเดือนก่อน +4

    The colors on the graph at 20:37 are really really hard to tell apart for people with deuteranopia (some 6% of males). These pastel colors are hard, it's much better if they are very definitely yellow, or blue, or red, or grey, etc. Just wanted to chip in, since we're talking about it already.

  • @jamesstevenson1766
    @jamesstevenson1766 9 หลายเดือนก่อน +54

    I've always felt vaguely guilty as a scientist for never using the violin plot functions in any plotting tool - thank you for lifting this weight from me.

  • @GeoQuag
    @GeoQuag 9 หลายเดือนก่อน +153

    The comment at the end about “why have two flaps” is the most upsetting part of them to me. The only time I’ve seen something like this (where they used histograms instead of smoothing, so less yonic) that seemed even a little defensible is the man/woman population age plots for different counties. It’s still probably better to arrange them differently but at least they were using the two sides for something.

    • @the_mad_fool
      @the_mad_fool 9 หลายเดือนก่อน +23

      Honestly, those would also just be better if they had both on the same side, as then you can compare them properly....

    • @GeneralTaco155555a
      @GeneralTaco155555a 9 หลายเดือนก่อน +18

      Exactly. What is the point of having your data mirrored onto both sides?
      Your smoothed out histograms were so nice you had to show them twice?
      BS.
      I do see how stacking histograms can get cluttered, but as you pointed out: if the goal is to compare histograms without stacking them, then make an asymmetrical violin chart with labeled axes so you can actually interpret the data.

    • @Appletank8
      @Appletank8 9 หลายเดือนก่อน +12

      The one sorta viable use case that just uses the other side for something useful, but they also don't just smooth it out.
      Ex. population age plot between men and women, Vertical axis is age, horizontal axis is pop count. men on left bars and women on right bars.

    • @cadosian078
      @cadosian078 9 หลายเดือนก่อน +10

      I thought population pyramids were good ways to visualize the data honestly.

    • @Qwicksilver
      @Qwicksilver 9 หลายเดือนก่อน +7

      Saw this on Polymatter and I thought the same thing. That’s the one viable use of this data visualization technique. But there it’s just two histograms turned on their side and placed opposite one another.

  • @Shasha_Mynx
    @Shasha_Mynx 4 หลายเดือนก่อน +1

    I've been recommended this video endlessly and eventually realized I'd subscribed from your other stuff. It's wild to me that this doesn't have more attention. Maybe the algorithm just knows me too well. There's as much to learn from your delivery as there is from the content.

  • @thecynicalone7655
    @thecynicalone7655 8 หลายเดือนก่อน +3

    As to the difficult social choices around the violin plot joke thing:
    A classic is to look confused and ask why the joke is funny in a very sincere way.
    Another way that tends to work for me is to just say "dude, c'mon", as that puts the onus squarely on them
    I do find it very helpful after a moment like the one you described to reflect on what I could have done differently while keeping my goal in mind. Generally speaking with stuff like this, the best approach is to deflate the other person, so to speak.
    I wish you all the best in your future strange and awkward social interactions

  • @ubahfly5409
    @ubahfly5409 9 หลายเดือนก่อน +123

    A violin plot to overthrow the physics department !

    • @LimeyLassen
      @LimeyLassen 9 หลายเดือนก่อน +25

      There's no need to resort to violins!

    • @offensivebeefroast5407
      @offensivebeefroast5407 9 หลายเดือนก่อน +8

      Let me get the band together

    • @PinataOblongata
      @PinataOblongata 9 หลายเดือนก่อน +5

      Plot twist!

  • @leehurst172
    @leehurst172 9 หลายเดือนก่อน +220

    As a non-scientist, I've always thought these plots were confusing and just obviously above my pay grade. Very validating to hear that they are indeed as uninformative as I thought they were. Much appreciated❤️

    • @thefaboo
      @thefaboo 9 หลายเดือนก่อน +9

      Same! I felt the same about radar plots for a long time until I found out there's no real consensus on how to read those either 🙃

    • @zorinzorinzorin5243
      @zorinzorinzorin5243 9 หลายเดือนก่อน +8

      This is such an important video. I remember that one of my high school textbooks had some stupid plot (that I now understand to be a violin plot) that the author loved to use. That book could have been half-a-pound lighter if they just took them out.

    • @leehurst172
      @leehurst172 9 หลายเดือนก่อน +7

      @@thefaboo yeaahhhhhhhh radar plots are cool until you realize the area can be altered by how the spokes are ordered lol. It's just a multi-variable plot with connections between each percentage for no real reason
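The point about spoke ordering can be checked with a little arithmetic. A minimal sketch, assuming equally spaced spokes and non-negative values (the function name `radar_area` and the scores are hypothetical): the radar polygon is a fan of triangles, each with area ½·rᵢ·rᵢ₊₁·sin(2π/n), so the total depends on which values end up next to each other.

```python
import math

def radar_area(values):
    """Area of a radar-chart polygon with equally spaced spokes (needs >= 3 values)."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n)  # sine of the angle between adjacent spokes
    # Sum the triangles formed by each pair of neighbouring spokes.
    return 0.5 * wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

# The same six hypothetical scores, arranged two different ways.
alternating = radar_area([9, 1, 8, 2, 9, 1])  # high and low values interleaved
grouped = radar_area([9, 9, 8, 2, 1, 1])      # high values placed side by side

print(f"alternating order: {alternating:.2f}")
print(f"grouped order:     {grouped:.2f}")
```

With identical data, the grouped ordering encloses more than twice the area of the alternating one, so the filled shape says as much about the axis order as about the values.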

    • @Kevin_the_Caveman
      @Kevin_the_Caveman 9 หลายเดือนก่อน +7

      The whole point of data visualisation is to make it easy to understand, otherwise you'd just dump raw data in table format at people, so your POV is perfectly valid. Of course, depending on context, if you are writing something to be read by people familiar with the topic instead of the general public, you can go a bit spicier on the complexity, but it should always be as simple as possible

    • @ShankarSivarajan
      @ShankarSivarajan 9 หลายเดือนก่อน

      On the contrary, they're information dense, combining histograms and box-and-whiskers plots for multiple sets of data. They're only uninformative if you decide not to read them.

  • @PatrickPoet
    @PatrickPoet 7 หลายเดือนก่อน +1

    Thanks for putting the last part in

  • @AnnaWillo
    @AnnaWillo 8 หลายเดือนก่อน

    I love your videos so much. I'm starting to think about going back to school and have been considering physics, and your discussions of academia and your work and interests is so inspiring. Thanks for doing what you do!

  • @NeonNijahn
    @NeonNijahn 9 หลายเดือนก่อน +43

    I've never once used a violin plot... but for some reason i still felt like i was in trouble the whole video.

  • @wraithwrecker_
    @wraithwrecker_ 9 หลายเดือนก่อน +77

    I thought, "Well they look funny, but surely there's a reason why they'd be useful!" And then at the 8-minute mark, you finish explaining how you make a violin plot and I'm like, "Okay but why would you do that though???" I think it's a terrible plot already and there's still over 30 minutes of reasons to listen to. Brilliant!

  • @bradbarbour8168
    @bradbarbour8168 4 หลายเดือนก่อน

    What an incredible argument. As a student currently studying business analytics, I find this video extremely valuable. I just found your channel and I have enjoyed every video I have watched. Keep these videos coming!

  • @descentplayer
    @descentplayer 2 หลายเดือนก่อน +1

    I am a guy and never saw a violin plot before this video. I did not become aware that some of them looked like a vulva until you brought it up. I just saw sideways histograms that all had reflections.

  • @punkinholler
    @punkinholler 9 หลายเดือนก่อน +83

    My grad school used to have professional plot makers on staff. It was long before my time but the space they worked in was still there and there were some people still working there who remembered them.

    • @acollierastro
      @acollierastro  9 หลายเดือนก่อน +22

      Professional plot makers! I love that.

    • @robertadsett5273
      @robertadsett5273 9 หลายเดือนก่อน +1

      Pretty much the same for me. There were still a few around but they were on the way out. Images had to be pasted into place

  • @michaelfairchild6768
    @michaelfairchild6768 9 หลายเดือนก่อน +28

    Violin plots are overused but they have a use case for comparisons of a large number of samples that have complex distributions. We use them for this when comparing gene expression in cell populations. We can quickly see the 'shape' and get the vibe of the multimodal gene expression for large sample numbers.

    • @rikwisselink-bijker
      @rikwisselink-bijker 8 หลายเดือนก่อน +4

      Exactly, the point of the violin is just the vibe. The data is in the box plot inside of the violin.

    • @bordeterre5234
      @bordeterre5234 8 หลายเดือนก่อน

      Wouldn’t ridge plots work in that kind of situation ?

  • @2Cool4School6
    @2Cool4School6 9 หลายเดือนก่อน

    Starting my second year in college with the goal of graduating with BA in mechanical engineering. Found your videos accidentally! Love them so far! Thank you for sharing.

  • @rmsgrey
    @rmsgrey 7 หลายเดือนก่อน +3

    There's a strong link between embarrassment and humour, and a lot of embarrassment over taboo topics like anatomy that roughly half of people have (particularly among teenagers and people who haven't got over having been teenagers).
    As for the use of references as a form of comedy, the basic idea is "we all laughed at {thing} then; remembering it will bring you to a similar state of mind and probably make you laugh now". If you didn't laugh as a teenager when someone broke taboo, then you're not going to find it funny when people try to evoke the experience as an adult. On the other hand, if you're someone who found Monty Python hilarious, then someone saying "this parrot - " (the pause is essential) is going to remind you of John Cleese screaming at Michael Palin and very likely get at least a smile, if not a chuckle, out of you.
    There's also a whole in-grouping thing going on - "you and I share an understanding of this reference, therefore I am a member of the in-group and popular and successful"

  • @iesmeh
    @iesmeh 9 หลายเดือนก่อน +58

    I am not a scientist. I have never heard about violin plots before today. But now I know about them and why they are mostly useless. I cannot overstate how much talent you have for making subjects like these interesting. A lot of the time, I pause science videos while alt tabbing to other things, taking in the videos in chunks. I always seem to watch yours straight through, beginning to end. The way you break up your videos with music and title-cards really helps make them digestible. Thank you!

  • @MrBfiguero
    @MrBfiguero 9 หลายเดือนก่อน +23

    A picture is worth a thousand words. Dr. Collier's beleaguered sigh is worth a thousand data points.

  • @CatherineKimport
    @CatherineKimport 7 หลายเดือนก่อน +1

    I clicked on this video thinking "oh what's wrong with those they look cool" but you managed to thoroughly win me over

  • @danielvillalobos7365
    @danielvillalobos7365 9 หลายเดือนก่อน +1

    Thanks to your video I now know how to read a violin plot. I always used to just skip over them, but now that I understand what they mean, I can continue skipping over them without feeling guilty about it.

  • @jainabraina
    @jainabraina 9 หลายเดือนก่อน +16

    I enjoy that this turned into a histogram appreciation video because histograms are really great.

  • @welcomeblack
    @welcomeblack 9 หลายเดือนก่อน +36

    I think the type of smoothing they're doing is called a Kernel Density Estimate. They didn't teach us about KDE plots in physics classes because, as you say, they mostly just show vibes, but it's still better than the arbitrary-window smoothing you're suggesting they do. See the Seaborn documentation for violinplot
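For readers who have not met a kernel density estimate, here is a minimal sketch in plain Python. It is not how Seaborn implements it; the function name `gaussian_kde`, the sample values, and the bandwidth of 0.3 are all assumptions made for illustration. Each data point contributes a small Gaussian bump, and the estimate is the average of those bumps.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density at x."""
    n = len(samples)
    # Normalisation so the summed Gaussian bumps integrate to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical bimodal sample: one cluster near 1, another near 4.
samples = [1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 4.1]
density = gaussian_kde(samples, bandwidth=0.3)

# Evaluate on a grid from -2.0 to 7.0 and check the curve integrates to about 1.
step = 0.05
grid = [i * step for i in range(-40, 141)]
values = [density(x) for x in grid]
area = sum(values) * step

print(f"density near first mode (x=1.05): {density(1.05):.3f}")
print(f"density between modes  (x=2.50): {density(2.5):.3f}")
print(f"approximate area under the curve: {area:.3f}")
```

The bandwidth plays the role that bin width plays in a histogram: a larger value smooths the two modes together, a smaller one produces a spikier curve, which is part of why the result reads as "vibes" unless the bandwidth is reported.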

  • @user-eo1ju3uu9x
    @user-eo1ju3uu9x 7 หลายเดือนก่อน +2

    About papers 100 years ago: based on the memoirs of Stephen Timoshenko, it seems that there were special people at universities who prepared plots. You would give them hand-sketched drawings and they would then prepare versions for a paper.

  • @AndyBuildsStuff
    @AndyBuildsStuff 4 หลายเดือนก่อน

    Great video! Your point about choosing a plot which conveys the most important thing about the data really hits home. Exploring one’s data is so important.

  • @Nossimid
    @Nossimid 9 หลายเดือนก่อน +111

    This is actually kind of funny. I'm a PhD student in statistics, and I learned about violin plots for about 10 seconds in one of my first year courses. Just a few weeks ago I ran into a situation where I actually considered using violin plots to convey the distribution of sequence lengths for a system running in different states. However, I did ultimately decide to use a different plot, because the finished violin plot just looked too weird, and would have been distracting. I admit, I've never seen them used in a professional setting by other statisticians or scientists.

  • @drmodestoesq
    @drmodestoesq 9 หลายเดือนก่อน +25

    In 1956, Bette Nesmith Graham (mother of future Monkees guitarist Michael Nesmith) invented the first correction fluid in her kitchen. Working as a typist, she used to make many mistakes and always strove for a way to correct them. Starting on a basis of tempera paint she mixed with a common kitchen blender, she called the fluid "Mistake Out" and started to provide her co-workers with small bottles on which the brand's name was displayed.

  • @CamilaJosino1
    @CamilaJosino1 9 หลายเดือนก่อน

    THANK YOUUUUU! Just became a subscriber! Good luck for the channel! 🙂

  • @JohannesBee
    @JohannesBee 8 หลายเดือนก่อน +2

    It's like someone took a minimalist stingray and started messing with the aspect ratio

  • @zettabyte323
    @zettabyte323 9 หลายเดือนก่อน +18

    "you can't actually get data, you're just getting vibes" idk why i love this quote

  • @m.f.3347
    @m.f.3347 9 หลายเดือนก่อน +119

    Violin plots are just Georgia O'Keeffe paintings for STEMlords

    • @keenanlarsen1639
      @keenanlarsen1639 9 หลายเดือนก่อน +6

      that is so spot-on 🤣

    • @RobertKnutzen
      @RobertKnutzen 9 หลายเดือนก่อน +9

      i came to this comment section to make this joke and now i feel unoriginal

  • @snoadog
    @snoadog 7 หลายเดือนก่อน

    Thanks for implanting violin music in my head every time I see/visualize violin shaped, non-violins!

  • @TessThisMess
    @TessThisMess 9 หลายเดือนก่อน +1

    You've become my favorite lol. I used to complain to my bf about how anytime you present data online people always point out "Correlation doesn't equal causation" as if that dismisses all the data or is a complete argument. When you referenced that a few videos back I melted.
    I'm so ready for Schrodinger's cat!

  • @marcins.1128
    @marcins.1128 9 หลายเดือนก่อน +41

    When you made a mistake when typing on a typewriter you had to use a backspace then you would use a special white chalk covered piece of paper - put it between the typing tape and the paper then type the wrong character again - that would "erase" the wrong character, so you could use backspace again and press the correct key.

    • @marcins.1128
      @marcins.1128 9 หลายเดือนก่อน +2

      Now I feel really old.

    • @HunchbackJack
      @HunchbackJack 9 หลายเดือนก่อน

      @@marcins.1128 I'm old enough to remember when those Tipp-Ex strips came out. They were an amazing innovation. Before that, you needed to use an ink eraser, or some noxious kind of solvent that faded the ink on the page. Later, some typewriters had the whiteout strip *built into the ink ribbon*. It was a magical time.

    • @mercury5003
      @mercury5003 9 หลายเดือนก่อน

      @@marcins.1128 I'm only 26 and I happened to grow up with one. There was one at my mom's old job when she'd take me as a kid and I'd play around with it. I'm not sure if the backspace function worked the same way though.

    • @marcins.1128
      @marcins.1128 9 หลายเดือนก่อน

      @@mercury5003 there were also newer typewriters with two tapes - one of them was the erasing one. They stored some recent characters in memory so you could use the backspace as on your PC.

    • @jtsiomb
      @jtsiomb 9 หลายเดือนก่อน +1

      Or just cross it out with X-es and type it correctly right next to that or above that. Or use correction fluid, wait for a bit, then type over it (it always looked different though), or re-type the whole page.... depends on what you're doing and what are the tolerances for nice presentation vs just having the text on a piece of paper.

  • @andrewmatas6984
    @andrewmatas6984 9 หลายเดือนก่อน +84

    I have used violin plots and liked them. You have convinced me that I was wrong.

    • @ubahfly5409
      @ubahfly5409 9 หลายเดือนก่อน +15

      Oh really? Where were you Jan 6th !?

    • @aidanjimenez9343
      @aidanjimenez9343 9 หลายเดือนก่อน +3

      @@ubahfly5409what does this mean dawg

    • @thefaboo
      @thefaboo 9 หลายเดือนก่อน +4

      ​@@aidanjimenez9343I think they were (jokingly) calling you a terrorist....

    • @gebali
      @gebali 9 หลายเดือนก่อน +8

      We all make mistakes. But not all of us admit to them publicly

    • @rbr1170
      @rbr1170 9 หลายเดือนก่อน +2

      ​@@gebaligive me a sec and I'll make a violin plot of that.

  • @quote.d
    @quote.d 9 หลายเดือนก่อน +2

    I did a violin plot in one of our articles, and after watching this I can say I did it pretty mindlessly. It was an option. It looked like a cool data viz. It showed an increase in a median altered genome segment length as well as a higher number of longer segment alterations. Which is basically the same thing, as I look at it now. But during writing we were in that phase of trying to get our point across. In that mindset the overloaded graph seemed useful.
    After watching this I'd just do ridgeplots. Thanks for taking your time to talk about this.

  • @CharlesBarouch
    @CharlesBarouch 9 หลายเดือนก่อน +1

    I've been doing editing since back in the days of analog cut and paste -- literally cutting up galley proofs and pasting them on boards to do layout -- I am constantly appreciative of all the steps I no longer have to do in publishing.

  • @oscarfriberg7661
    @oscarfriberg7661 9 หลายเดือนก่อน +163

    The animation at 18:20 could've just been a line diagram. Perfectly conveys the same data with just a single image. Super easy to plot in Excel too. No need to make it a complicated animation that’s impossible to understand.
    I feel like that's often the case with dataisbeautiful. It's almost a competition in presenting the most basic data in the most convoluted way possible. Like those "make the worst volume bar" UI challenges, but serious.

    • @ultimatedude5686
      @ultimatedude5686 9 หลายเดือนก่อน +18

      Since it's supposed to be different levels of legalization I would've gone for a stacked area graph, but I agree with the sentiment. It looks kinda cool but it's definitely worse for conveying information.

    • @dalmationblack
      @dalmationblack 9 หลายเดือนก่อน +22

      one of dataisbeautiful's biggest issues is an obsession with making data animated that doesn't need to be
      it's honestly way easier to tell how quick something is by looking at a slope on a time plot than by trying to compare different speeds half a minute apart in the same animation

    • @antonhelsgaun
      @antonhelsgaun 7 หลายเดือนก่อน +1

      Why do you want a line, or for them to be connected at all? Shouldn't it be 4 separate bars?

    • @oscarfriberg7661
      @oscarfriberg7661 7 หลายเดือนก่อน +3

      @@antonhelsgaun 4 separate lines that demonstrate the change over time. Then you don’t need to make it an animation.
      Or do a stacked area graph like mentioned above.

  • @JoshuaNorton
    @JoshuaNorton 9 หลายเดือนก่อน +12

    Ha, only a minute in and I already adore the video. Out of curiosity I had a professor give me old dissertations to see how they used to do data visualisation back in their days. And the solution is glue. Glued in graphs hand drawn on graph paper. Glued in photographs of the setups. And then it clicked in my head why we learned to do all that glueing stuff on paper in elementary school.

  • @mattwatson3407
    @mattwatson3407 8 หลายเดือนก่อน +2

    Wow, actually sharing your experience at work was really enlightening. I never would have thought about how distracting and uncomfortable it would be beyond the off-color comment. The lingering after effect sounds like it had much more of an impact. Thank you for sharing.

  • @Omni-Kriss
    @Omni-Kriss 9 หลายเดือนก่อน

    I really love your videos! Keep it up :)

  • @Marstead
    @Marstead 9 หลายเดือนก่อน +83

    My wife is in astrophysics -- she has a very similar experience to what you described with the violin plots, where she'll go out to dinner with a bunch of male physicists and the waiter will come up and say "Well, ladies first!" And she can't explain how frustrating it is to have the entire table be alerted/reminded of the fact that she's the only woman there. It's tough in that situation because she can't comment on how it makes her uncomfortable because the waiter's not being a bad person about it and it'll make her look bad if she brings it up at the table. So she just kind of has to deal with it. It sucks!

    • @Daniel-ih4zh
      @Daniel-ih4zh 9 months ago +1

      Why exactly would this make someone uncomfortable lmao? Bizarre these "diversity and inclusion" people become uncomfortable when they're unique

    • @vickypedia1308
      @vickypedia1308 9 months ago +15

      @@Daniel-ih4zh I genuinely hate when people acknowledge I'm a woman and that I'm special because I'm the only woman around at the moment. Why the hell does my gender matter to you guys, I'm here to do my business like everyone else. I didn't "earn" the trait of being a woman so it seems weird to point it out as if it were extraordinary.

    • @Daniel-ih4zh
      @Daniel-ih4zh 9 months ago

      @@vickypedia1308 I 100% agree and understand. But it seems like people are having their cake and eating it when they hold this sentiment while also promoting things like WiStem and AA

    • @vickypedia1308
      @vickypedia1308 9 months ago +7

      @@Daniel-ih4zh I don't know what those terms mean (not a native speaker, so if those are terms where I live they likely have different acronyms). However, I would like to add that the people who get uncomfortable when someone highlights that they're "special" for being some sort of minority are usually not the same ones who actively advocate for special treatment. For those who do, it tends to be because they're two different kinds of "special treatment" and one of them feels patronizing while the other doesn't.
      Personally, I think we should strive to decrease sexism at the workplace, not force women quotas or other artificial stuff like that. Sexism is more likely to happen in fields that are predominantly pursued by men, simply due to the lack of women who can point out if someone is being sexist. (And even if there are one or two women there, you don't want to be *that* person who complains about something nobody else sees an issue with.) In my opinion, the fix isn't to forcibly try to get women into that field and making a big deal out of it. I would certainly not want to be the token woman who only got in because a company needed to fill a quota. I think we should rather try to make the place feel welcome to *any* person, women included.

    • @lepannean4231
      @lepannean4231 9 months ago +17

      @@Daniel-ih4zh You think it's bizarre when people who want to be included as equal participants, get uncomfortable at being singled out for no reason? It's *almost* like you think "equality for minorities" is the same thing as "special treatment". Hmm. Maybe you should reflect more about that.

  • @ausiidnd
    @ausiidnd 9 months ago +48

    Finally! Someone else hates these things!

    • @Amira_Phoenix
      @Amira_Phoenix 9 months ago +3

      No science, only vibes 🙄 also, 🐱

  • @ariadne4720
    @ariadne4720 7 months ago +2

    I worked as a statistician in the 90s and into the early 00s, and never heard of a violin plot. Knowing what they are now, I see they are entirely useless.

  • @Zero_Gravitas
    @Zero_Gravitas months ago

    I have no idea why the algorithm decided I needed to see this but I'm glad it did.

  • @douglasmagowan2709
    @douglasmagowan2709 9 months ago +20

    People love "chart art." I used to work in finance. There was pressure to replace data tables with charts when possible, even if the charts ultimately distort the data. But the chart makes the report look pretty.
    A very common chart that I absolutely hate is the 3D pie chart. The pie chart is already a bad chart, but the 3D version sabotages what meaning a pie chart has: the areas become distorted and are no longer direct representations of the weights.

    • @OlleLindestad
      @OlleLindestad 9 months ago +3

      A 2D pie chart is only bad when there are more than two categories in it. A pie chart with two categories is excellent. You can immediately see whether the fraction displayed is closer to a quarter, half, or three quarters.
      For more than two categories, a stacked bar is better.

    • @davidstrickland3510
      @davidstrickland3510 9 months ago +1

      I mean, you can immediately tell whether X% is closer to a quarter, or a half, or whatever without the chart. Charts should ideally be used to get a snapshot of lots of data--not just to make a list of percentages look fun.

  • @moseistrujillo8300
    @moseistrujillo8300 9 months ago +175

    There are so many data visualizations that just should be wiped from existence. Violin plots are at the top of my personal list; they are outclassed in every way in modern data visualization.

    • @BlueSapphyre
      @BlueSapphyre 9 months ago +8

      Pie chart/Gauges are the top of my list. But for some reason C-suite loves them.

    • @PBMS123
      @PBMS123 9 months ago +12

      @@BlueSapphyre pie charts are fine.

    • @hyphenatednick
      @hyphenatednick 9 months ago +5

      If I could murder one type of plot it would be the pie chart. Not because they're worse than violin plots, but because they are more prevalent.

    • @rbr1170
      @rbr1170 9 months ago

      @@hyphenatednick and often used in the wrong way.

    • @rbr1170
      @rbr1170 9 months ago +1

      @@PBMS123 only if used properly.

  • @yooperkids
    @yooperkids 8 months ago +1

    you are one of the funniest people I know on youtube. the bone dry humor always has me cracking up and you do a great job explaining ideas (coming from someone leaning more towards the humanities). bean plot made me laugh so hard. keep it salty.
    edit: got to the 27 min mark and felt slightly disappointed and surprised that you hadn't mentioned how they are vulvas. then noticed the video had 20 min left. lmao

  • @sreekutty3682
    @sreekutty3682 7 months ago

    how do i super like this video? loved every second of this, especially the latter half.

  • @bejeweled280
    @bejeweled280 9 months ago +20

    I love your videos. Like why would I care why a certain plot is horrendous? 40 plus minutes later I'm super invested and ready to go on the war path about violin plots.

  • @VikcocVyk
    @VikcocVyk 9 months ago +59

    This went from ha ha, to not ha ha real fast
    I want to say that you do an amazing job explaining the female side of these interactions
    The fact that you articulate why it was not ok is a great source of information for people who want to do good but don't yet see how certain things are problematic

  • @mrtspence
    @mrtspence 9 months ago +11

    I have deep respect for your pivot into the more whimsical hot-takes lately. This is great and hilarious.

  • @lloydy272
    @lloydy272 9 months ago

    Thanks for the in-depth discussion. I do like being able to compare many distributions at once (>3) and I do not like to overlay that many, especially when there is an order (time course or physical proximity). But I'm not a fan of boxplots, so I tried violin plots and then settled on univariate scatter plots with jitter to reduce the overlap of data points on top of each other. In biology this style has become very popular in the last 7-8 years.
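
A minimal sketch of the jittered univariate scatter idea described in the comment above, written in Python with only the standard library. The function name `jittered_points`, the sample values, and the jitter width are illustrative assumptions, not something taken from the comment or the video.

```python
import random

def jittered_points(values, centre, half_width, seed=0):
    """Pair each value with a randomly offset x position around `centre`.

    Only the x coordinate is perturbed; the measured values are untouched,
    so the jitter separates overlapping points without altering the data.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    return [(centre + rng.uniform(-half_width, half_width), v) for v in values]

# Hypothetical measurements for one group, with several repeated values
# that would sit exactly on top of each other without jitter.
group_a = [3.1, 3.1, 3.1, 3.4, 3.4, 4.0, 4.0, 4.0, 4.0, 5.2]

points = jittered_points(group_a, centre=1.0, half_width=0.2)
for x, y in points:
    print(f"x={x:.3f}  y={y}")
```

The resulting (x, y) pairs can be handed to any scatter routine. Every observation stays visible, and no smoothing parameter is involved, which is presumably part of the appeal over a violin plot.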

  • @FrogVoice
    @FrogVoice 9 months ago +71

    When I wrote my bachelor's thesis in uni my supervisor insisted that I use violin plots to show some data. The problem was, the outliers in my dataset were not many but they were far out, like really really far out. So in the plots they often just weren't visible at all, but I had to include them to represent the data accurately, at least that's what I was told. In the end the captions for every figure featuring these plots ended up absurdly long because I had to explain what the hell was going on there lest I forget it myself.
    So I 100% agree with you that these plots are just bad in every regard.
    The bit about these plots looking like genitalia is also true in every regard: because of these outliers some of my medians ended up at the bottom of the plot so that's of course where the belly was located, this however had the fun little side effect of making these particular plots look like a cock and balls. So my supervisor basically insisted that I would draw a bunch of dicks in my thesis. These things are truly terrible, they just always look like genitalia

    • @tanyabils9399
      @tanyabils9399 9 months ago +3

      I would argue including the outliers made your visual less accurate not more.

    • @coreycampbell1689
      @coreycampbell1689 9 months ago +12

      Maybe we should just call them Rorschach plots

  • @TobiasWeg
    @TobiasWeg 9 months ago +21

    I'll come out and say that I used violin plots in my PhD thesis. I had grain size distributions to display. Histograms have the big problem that you cannot easily overlay a lot of them. I had, I think, ten different samples I wanted to display next to each other (to make them comparable). Furthermore, I wanted to use the median to simplify the further discussion, but I also wanted to show the actual distribution of the particles, as it was important for the behavior I was looking at. The violin plot was a good combination of:
    1. The median is visually represented.
    2. I do show the actual distribution, so I can discuss the skew, if there is one.
    3. I can pack a lot of them next to each other, the reader gets a good visual representation of the different distributions.
    4. Yes, I find them visually pleasing, if done right. I did plot them horizontally, though.
    5. I think the violin plot is symmetrical for the same reason the boxplot is symmetrical. And when I think about a metal grain (which is what I worked with), the shape felt like it represented the form of the actual grains.
    I did not add another histogram. I did give the smoothing value and normalized the width to one.
    As I watched the second part of your video, I was sorry to hear about the unfortunate situation and that this plot makes you and other women uncomfortable. I did not know about this, and it was absolutely not obvious to me. I am sorry for that.
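
The "smoothing value" and the width normalization mentioned in the comment above matter because the outline of a violin is a kernel density estimate, and its shape depends on the bandwidth. The following is a rough standard-library Python sketch of that dependence. The grain sizes, the two bandwidths, and all function names are invented for illustration; this is not the commenter's actual method or any plotting library's API.

```python
import math

def gaussian_kde(samples, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def normalize_width(density):
    """Rescale a density curve so that its widest point equals 1."""
    peak = max(density)
    return [d / peak for d in density]

def count_modes(density):
    """Count strict local maxima of a sampled curve."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] > density[i + 1]
    )

# Hypothetical grain sizes (arbitrary units): two clusters, near 2 and near 8.
sizes = [1.6, 1.8, 2.0, 2.1, 2.3, 2.4, 7.6, 7.8, 8.0, 8.1, 8.3, 8.4]
grid = [i * 0.1 for i in range(101)]  # 0.0 to 10.0

narrow = normalize_width(gaussian_kde(sizes, 0.5, grid))  # light smoothing
wide = normalize_width(gaussian_kde(sizes, 4.0, grid))    # heavy smoothing

print("modes with bandwidth 0.5:", count_modes(narrow))
print("modes with bandwidth 4.0:", count_modes(wide))
```

The same twelve numbers give a two-humped outline with the small bandwidth and a single hump with the large one, which is why reporting the smoothing value alongside the plot, as the commenter did, is the responsible choice.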

  • @zeewirszyla
    @zeewirszyla 9 months ago +2

    26 minutes in: "They look like ____"
    Me: "THANK YOU! FINALLY!"

  • @loislane5092
    @loislane5092 9 months ago +1

    I agree wholeheartedly with the second portion of the video (the first too). In situations like the one you described (the not-funny jerk), I will invariably pull out my phone, act like someone's calling, and then interrupt and say loudly "1950's on the phone and wants its joke back". And I'm a male who was born in the late 1950s. I have absolutely *no* time for people like that and *do not* let it pass without comment. Why should I? I'm at an age where I don't have to any more. And coming from a time when "humor" like that was "normal", I actually feel obliged. Damn, you're good.

  • @spacelem
    @spacelem 9 months ago +27

    I'm only 8 minutes into the video, so you may well change my mind before the end, but I have made good use of violin plots in my work. When I've been comparing posterior distributions of multiple parameters from multiple different MCMC chains, the violin plots have been an excellent way for me to tell at a glance what the data is doing, and if there are any severe problems.
    Boxplots do not tell you if your posteriors are multimodal, violin plots do, and a histogram with 30 variables is going to be completely unreadable. I don't really care about the precise values of the interquartile ranges, I want to see if chains are converging to the same unimodal distributions. None of this information is for presenting in a paper (I'll give sensible posterior distribution plots there), it's for me (and my collaborators) to understand how well my MCMC is converging, and where the problems are. For that they work pretty well.
    Okay, now I'm going to shut up and continue watching to hear what you have to say!
    EDIT: all my violin plots were horizontal. It never actually occurred to me what they resembled when viewed vertically.... Also I work in veterinary epidemiology (I'm a mathematical modeller), where the majority are women, and my supervisor is a German woman (also did maths at undergraduate), who has no issues with speaking her mind, so I don't think anyone would be as daft as to joke about it!
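
The claim above that boxplots cannot reveal multimodality can be checked with a small standard-library Python sketch. The two samples below are invented stand-ins for draws from two MCMC chains and are constructed, as an assumption of this example, to share the same five-number summary. The helper names are hypothetical.

```python
import math
import statistics
from collections import Counter

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a basic box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

def coarse_histogram(values):
    """Count values in unit-wide bins centred on the integers."""
    return Counter(math.floor(v + 0.5) for v in values)

# Invented samples: one concentrated around 5, one split around 4 and 6.
one_peak = [0, 2.2, 3.2, 3.6, 4, 4.6, 4.8, 4.9, 5,
            5.1, 5.2, 5.4, 6, 6.4, 6.8, 7.8, 10]
two_peaks = [0, 3.7, 3.8, 3.9, 4, 4.0, 4.1, 4.2, 5,
             5.8, 5.9, 6.0, 6, 6.1, 6.2, 6.3, 10]

same_box = five_number_summary(one_peak) == five_number_summary(two_peaks)
print("identical box plots:", same_box)

for name, sample in (("one_peak", one_peak), ("two_peaks", two_peaks)):
    counts = coarse_histogram(sample)
    print(name, [counts[b] for b in range(11)])
```

Both samples would produce the same box and whiskers, while the bin counts show one hump in the first sample and two in the second. Any density-style display (histogram, ridgeline, or violin) exposes the difference; the box plot alone does not.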

    • @Ibeechu
      @Ibeechu 9 months ago +1

      ja but if your data set is multimodal, why even use a box plot? Or, like, make a histogram since that's showing the important parts and then put the quartiles in a table or something?

    • @TheFartoholic
      @TheFartoholic 9 months ago +4

      I agree with this use case - the shinystan package in R makes good use of violin plots and has saved me a lot of time in evaluating models. But I'd argue that the usefulness of violin plots goes beyond MCMC. Overlaying density plots / histograms is ideal in most situations, but things get incredibly cluttered the moment you have >4 lines to plot. Having multiple panels of densities works - but is essentially a violin plot without the mirroring.

    • @qu765
      @qu765 9 months ago +4

      ridge line plots tho

    • @TheFartoholic
      @TheFartoholic 9 months ago +1

      @@qu765 Ridgeline plots are good and probably the first choice, but violins are particularly good when you want to compare groups across different strata (e.g., geography and income)

  • @TheBBQify
    @TheBBQify 9 months ago +12

    i love watching your videos while i procrastinate on my physics 101 homework. makes me feel like i'm actually working

  • @edp2260
    @edp2260 2 months ago +1

    Violin/Bean plots: another example that just because you CAN do something does not mean that you SHOULD.

  • @blake4197
    @blake4197 9 months ago

    Holy shit! I love this video! I'm glad I found your channel with this being my first exposure. I had a coworker using a violin plot and I had so many of these complaints but you put it so well!
    (I just complained about how it looks like they wanted it to look like those population charts with men and women, but the double-sidedness was useless, and also that a histogram is just easier to look at because it's sideways lol)

  • @RobFisherUK
    @RobFisherUK 9 months ago +12

    I sometimes have a lot of possible plots I could do, and generating all of them as violin plots is useful because I don't know in advance if my data is bimodal or whatever. And I can put lots of violins next to each other and compare them, unlike histograms.
    But sure, I won't put them in my presentations because by then I know the best way to show the data. Fair enough.
    Also excellent point about the smoothing! People not understanding their statistical models bothers me a lot.

  • @theprecipiceofreason
    @theprecipiceofreason 9 months ago +9

    Just one more way we seem to have lost the plot.

  • @StressDespot
    @StressDespot months ago +1

    16:44 ..HOMESTUCK? *i am immediately shot by the sniper watching my house*

  • @Sakkura1
    @Sakkura1 8 months ago +27

    9:09 Histograms are pretty difficult to use for comparison of single-cell RNA sequencing data. You can also make a variation on the violin plot by making each half of the violin represent a different condition, e.g. a control condition vs. some treatment (e.g. how does this drug that inhibits this pathway influence expression levels of this other thing, compared to no drug). The ridgeline plot is one alternative, but it's pretty space-inefficient unless you overlap the data and risk harming readability.

    • @ethanpayne4116
      @ethanpayne4116 7 months ago +8

      This video was more of a rant than a legitimate analysis of the use cases for the violin-plot. Even the example shown at 10:22 shows just how unreadable overlapping histograms become once you have more than 2. Violin plots are literally just a way of visualizing several histograms at once without making them collide with each other.

  • @alexanderkonczal3908
    @alexanderkonczal3908 9 months ago +79

    *me, struggling through learning data analysis*
    time to get my opinions validated about this dumbass plot

    • @alexanderkonczal3908
      @alexanderkonczal3908 9 months ago

      ok, I had to stop watching due to being a parent and didn't come back for a long time. whoops. I can safely say ALL my opinions were validated, except that... every time, I think of dangerous butt plugs rather than genitalia, which is even worse, imo?
      regarding why they mirror the curve about the axis, I think 1. some people are unnaturally obsessed with bilateral symmetry, and 2. mirroring the curve makes the changes in the data more dramatic, and this is a plot for the wholly unsubtle.

  • @m.streicher8286
    @m.streicher8286 9 months ago +43

    Once, I loudly expressed that a piece of lab equipment looked similar to a "toy" - I still cringe thinking about my own behavior.
    The key difference being that I was an 11 y/o

  • @acebecks6288
    @acebecks6288 9 months ago

    Just starting high school and we're learning about different kinds of plots. Really helpful vid lol

  • @jestersudz6085
    @jestersudz6085 9 months ago

    omg i love the ridge plot. the first two examples you showed for alternatives were kind of like hard to read at a first glance for me (i am not an academic; i never look at datasets LOL) but that third one was soo nice. showed everything so clearly.

    • @ExtremeExample
      @ExtremeExample 7 months ago

      The problem with ridgeline plots is that they obscure data. Important data is hidden behind the plot in front of it. It's absolutely awful at presenting data in an honest way. It's the same issue with the violin plot: people use it because it looks cool, but it's terrible at conveying information.