Dimensionality reduction with tidymodels for the Billboard Top 100

  • Published on Nov 4, 2024

Comments • 29

  • @517127 3 years ago +9

    I'm sure that your videos make a greater impact on the quality of my work than any paid course

  • @enicay7562 2 months ago

    Thank you

  • @carvalhoribeiro 2 years ago

    Awesome. Thank you so much !

  • @marcogelsomini7655 2 years ago

    16:50 very interesting point!

  • @mochardhikurniawan1009 3 years ago

    That's a great video, thank you for uploading this video :D
    Hope someday, you can make a video about time series using tidymodels with hyperparameter tuning

  • @tighthead03 3 years ago

    This is what I've been looking for, thank you very much. Now to figure out how to quantify which approach gives the best model performance. Thanks again.

  • @mkklindhardt 3 years ago

    Once again. Thank you Julia!

  • @aallstar414 3 years ago +1

    awesome video as usual!

  • @alextantos658 2 years ago

    Excellent tutorial and demonstration! However, there is one thing I don't really get. Why would one define an outcome variable for PCA? Isn't the point of this type of model to reveal hidden dimensions that express the variability of the data without having an outcome variable?

    • @JuliaSilge 2 years ago +1

      Oh for sure, it is not necessary to specify an outcome. You can see how to set up a recipe step with no outcome in the examples here: recipes.tidymodels.org/reference/step_pca.html#ref-examples
      The reason I included an outcome here was to show how you could use dimensionality reduction as a preprocessing step before fitting models, like this:
      www.tmwr.org/dimensionality.html#bean-models
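
      A minimal sketch of the "dimensionality reduction as a preprocessing step before fitting models" idea mentioned above. The dataset (`mtcars`), the outcome (`mpg`), the linear regression model, and the choice of three components are all stand-in assumptions for illustration, not what the video or the linked chapter uses:

      ```r
      library(recipes)
      library(parsnip)
      library(workflows)

      # Assumption: mtcars stands in for the real data, with mpg as the outcome.
      pca_rec <- recipe(mpg ~ ., data = mtcars) %>%
        step_normalize(all_numeric_predictors()) %>%          # PCA is scale-sensitive
        step_pca(all_numeric_predictors(), num_comp = 3)      # keep 3 components (arbitrary)

      # Bundle the recipe with a model so the PCA is estimated as part of fitting.
      pca_wf <- workflow() %>%
        add_recipe(pca_rec) %>%
        add_model(linear_reg())

      pca_fit <- fit(pca_wf, data = mtcars)
      preds <- predict(pca_fit, new_data = mtcars)
      ```

      Declaring the outcome in the recipe formula keeps `mpg` out of the PCA itself, because `all_numeric_predictors()` selects only columns with the predictor role.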

    • @alextantos658 2 years ago

      @JuliaSilge Thank you for the links! Chapter 16 of the book uses the "class" variable as the outcome variable for all four types of PCA that are exemplified, and it was not clear to me whether I would need to add an outcome variable while preparing the recipe. Your reply here clears that up. In the meantime, I also watched an earlier video of yours yesterday that uses "~." to leave out the outcome variable while conducting PCA. Keep up the good work!
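
      For readers following this exchange, a recipe with no outcome (the "~ ." form) might look roughly like the sketch below. The built-in `mtcars` data and the choice of two components are assumptions made only for illustration:

      ```r
      library(recipes)

      # Assumption: mtcars is a stand-in dataset; all of its columns are numeric.
      # The one-sided formula `~ .` gives every column the predictor role,
      # so no outcome is defined.
      unsup_rec <- recipe(~ ., data = mtcars) %>%
        step_normalize(all_numeric_predictors()) %>%
        step_pca(all_numeric_predictors(), num_comp = 2)

      unsup_prep <- prep(unsup_rec)                      # estimate means, sds, and loadings
      pca_scores <- bake(unsup_prep, new_data = NULL)    # component scores for the training data
      ```

      Here `pca_scores` holds only the component columns, since every original column was used as an input to the PCA.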

  • @janetfigueroa3288 2 years ago

    Do you need to use tests such as Bartlett's sphericity test and the KMO index (Kaiser-Meyer-Olkin) before doing PCA?

    • @JuliaSilge 2 years ago

      I wouldn't say "need to" per se; you can read more here:
      stats.stackexchange.com/questions/92791/why-does-sphericity-diagnosed-by-bartletts-test-mean-a-pca-is-inappropriate

    • @janetfigueroa3288 2 years ago

      @JuliaSilge Thanks Julia! I'll take a look. Do you test these assumptions when diving into PCA? Or what approach do you take, if that makes sense? So many nuances to these methods and assumptions can definitely bog down some steps.

  • @srdjanobradovic66 3 years ago

    Superb, thank you.

  • @xaviercasas100 3 years ago +1

    Very cool 👍 wish my teachers did videos like this

  • @guberney 3 years ago

    Thank you. An excellent video. Do you have any suggestions for multidimensionality reduction using tabular data as input?

    • @JuliaSilge 3 years ago +1

      Yes, these will all work for tabular data. Unless you mean something I am not understanding?

    • @guberney 3 years ago

      @JuliaSilge By definition, PCA is a method for quantitative variables. My question is about how to handle tabular data when applying dimensionality reduction methods such as PCA, PLS, or UMAP.

    • @JuliaSilge 3 years ago

      @guberney I'm not sure what you mean here; "tabular data" typically means data arranged in table form with rows and columns like what I have used here, like what you would find in a CSV or database or an R dataframe. Maybe you mean something else?

    • @guberney 3 years ago

      @JuliaSilge I mean qualitative variables, nominal or ordinal. I am sorry for the "tabular data" expression.

    • @JuliaSilge 3 years ago +3

      @guberney Ah gotcha. You can use `step_dummy()` to create dummy/indicator variables for any nominal/factor/string variables, before normalizing:
      recipes.tidymodels.org/reference/step_dummy.html
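
      The ordering described in this reply (dummy variables first, then normalizing, then the reduction step) could be sketched as follows. The built-in `iris` data, with its one factor column `Species`, and the choice of three components are assumptions made only for illustration:

      ```r
      library(recipes)

      # Assumption: iris is a stand-in; it has four numeric columns and one factor.
      mixed_rec <- recipe(~ ., data = iris) %>%
        step_dummy(all_nominal_predictors()) %>%         # factor -> indicator columns
        step_normalize(all_numeric_predictors()) %>%     # center and scale, indicators included
        step_pca(all_numeric_predictors(), num_comp = 3)

      mixed_scores <- bake(prep(mixed_rec), new_data = NULL)
      ```

      Because `step_dummy()` runs first, the indicator columns count as numeric predictors by the time the later steps select their inputs.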

  • @Stoney-g1o 3 years ago

    I am only able to set the resolution to 360p. Do you still have the original recording at a higher resolution? It is difficult to read the text in the IDE. Thank you for the excellent tutorials. I know it is a lot of work.

    • @JuliaSilge 3 years ago +2

      Can you try again? I can set it all the way to 1080p.

    • @Stoney-g1o 3 years ago

      @JuliaSilge Yes, different resolutions up to 1080p are available, with the default at 720p.