Market Basket Analysis (Association Rule Mining) With Excel

แชร์
ฝัง
  • เผยแพร่เมื่อ 23 พ.ย. 2024

ความคิดเห็น • 86

  • @DaveOnData
    @DaveOnData  3 ปีที่แล้ว +7

    💥 Learning R programming is easy for Excel users! 💥
    📺 th-cam.com/video/MNpsyjSuR20/w-d-xo.html
    👩‍🔬👨‍🔬 Learn to use your basic Excel skills to analyze the business like a Facebook data scientist:
    📺 th-cam.com/video/xIXymabyFIM/w-d-xo.html

  • @notjakegyllenhaal
    @notjakegyllenhaal 2 ปีที่แล้ว +16

    I’m a data analyst who just started working for a grocery store, so you can’t imagine how excited I am to watch this. Thank you thank you!!

    • @timhannafin8054
      @timhannafin8054 2 ปีที่แล้ว

      I'm interviewing with one very soon. And i love Excel. This is very interesting stuff and gives me so much insight, confidence, and further learning focus to develop my skills and career!

    • @nnajiobikachilo6807
      @nnajiobikachilo6807 ปีที่แล้ว

      Can i talk with you in private...I need your help with sth

    • @notjakegyllenhaal
      @notjakegyllenhaal ปีที่แล้ว

      @@nnajiobikachilo6807 no

  • @jtwest8
    @jtwest8 ปีที่แล้ว +6

    This is incredible and I don't know why we didn't learn this in school (or even in grad school) but all the better to learn from you Dave! Thank you for sharing your knowledge!

  • @darnelb1912
    @darnelb1912 3 ปีที่แล้ว +5

    I love how you solve this type of analysis with Excel cause you can see everything step-by-step. I read and followed an article to do it with R but I didn't know what was happening behind the algorithm and was pretty difficult to understand till now. Thanks Dave!

  • @davidbolduc4378
    @davidbolduc4378 3 ปีที่แล้ว +3

    Perfect timing to find this gem of a channel. Intro to Data mining & Intro to Business Analytics and Big Data finals are right around the corner.

  • @erolbrown
    @erolbrown 2 หลายเดือนก่อน +1

    This is absolute gold. So clearly explained.

    • @DaveOnData
      @DaveOnData  2 หลายเดือนก่อน

      Thank you for the feedback! Glad you enjoyed the video.

  • @ltdata5282
    @ltdata5282 3 ปีที่แล้ว +3

    Thank you for the explanation! I’m planning on using python for my mba, but seeing you explain it using excel really helped me understand the concept in high level way

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      You are welcome and so glad you found the content useful!

  • @yemiakinwande7039
    @yemiakinwande7039 3 ปีที่แล้ว +2

    Very useful for me as a Business Intelligence Analyst. Awesome teaching, thanks David

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      So glad you found the content useful! Be sure to check out some of my other tutorials (e.g., K-Means clustering). 😁

  • @lotteryplayer
    @lotteryplayer ปีที่แล้ว

    Awesome stuff David. I’ve used countifs in excel to associate two different criterias. One main criteria with a table of others. Then I create another column of total frequencies with a final hit percentage. I used a data link to change the main criteria. Using solver I could dig even deeper. Maybe I’ll look at R to get a full count down from best to second best, etc.

  • @kristoferbrown8007
    @kristoferbrown8007 ปีที่แล้ว

    Fantastic content Dave. Very simple, straight forward yet powerful demonstration.

  • @roya8683
    @roya8683 2 หลายเดือนก่อน +1

    Really amazing explanation David beautiful work.

    • @DaveOnData
      @DaveOnData  2 หลายเดือนก่อน

      Wow! Thank you so much for the kind words. Glad to hear you enjoyed the video.

  • @kisholoymukherjee
    @kisholoymukherjee 2 ปีที่แล้ว +1

    David, your content is awesome. Please put up more videos on machine learning.

  • @nealcardoso6528
    @nealcardoso6528 3 ปีที่แล้ว +1

    Great video and explanation of how Market basket analysis works in excel. thank you for posting

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      My pleasure! Glad you found the video useful.

  • @NARFkarriere3
    @NARFkarriere3 3 ปีที่แล้ว +3

    Thanks for a great video! I actually tried your model to analyze some data from my own company, and set it up with 4-ways lift and 5-ways lift. When trying the 5-ways lift I encountered a problem with the Solver, it actually picked the same decision variable twice. In the Titanic Dataset this could for example be "Female", "Firstclass", "Istriplet", "Female" and "Survived" - that is "female" is choocen twice. Do you happen to know how we can make a constraint that ensure unique decision variables? Thanks in advance for any help.

  • @ThepExcel
    @ThepExcel 2 ปีที่แล้ว

    Really Great Example! Thx!!

  • @joncowan2732
    @joncowan2732 2 ปีที่แล้ว +2

    Hey David, Great video! I wanted to point out that you don't mention that the formula for the occurrences needs to be entered as an array and it doesn't show that in the video either. Took me a long time to figure this out and get the formulas to work correctly :/

  • @hardiktyagi1955
    @hardiktyagi1955 2 ปีที่แล้ว +1

    TBH I was a little skeptical to start a 20 min video on this but this was a very efficient way of explaining each and everything. I am really thankful to TH-cam algorithm that I it showed on the top of my list. LIKED, SUBSCRIBED (for all notifications). also hyped to learn R with @david.
    Thanks Man! Happy new year!

  • @nguyenhoangthinh3881
    @nguyenhoangthinh3881 ปีที่แล้ว

    Thank you very much for your Video. It is Awesome. I wonder if you may do a simple video about random forest. Thank you!

  • @eugenioremedi7380
    @eugenioremedi7380 3 ปีที่แล้ว +1

    From Argentina... Awesome, absolutely awesome explanation, congratulations

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Argentina!? Awesome! So glad you found the content useful.

  • @pamidimalla
    @pamidimalla ปีที่แล้ว

    Great explanation. Subscribed and going to watch all your excel videos Dave. Thanks so much for your efforts in making such great videos.

  • @britishrockguy3229
    @britishrockguy3229 6 หลายเดือนก่อน

    David, excellent video, so logical and clear, as with your other videos. I see that you base it on binary values of 0 or 1 but what if the characterstics have multiple values such as a likert based scale or for example such as a person's age?

  • @ashishyadav7380
    @ashishyadav7380 3 ปีที่แล้ว +2

    great video dave! congrats on the quality work.

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thank you for the feedback, much appreciated! 🙏🙇‍♂️

  • @RichardRebeles
    @RichardRebeles 3 หลายเดือนก่อน +1

    very awesome video great job bro!

    • @DaveOnData
      @DaveOnData  3 หลายเดือนก่อน

      Thank you! I'm glad you enjoyed the video, and I appreciate the kind words.

  • @simonvanleeuwen7877
    @simonvanleeuwen7877 2 ปีที่แล้ว +2

    Hi David, I am wondering if I can use this technique to segment the market for my thesis. In my survey, I have a lot of categorical variables such as the living area, income (in classes), education, etc. The purpose of my thesis is to research if there is a link between these categorical variables and the interest in the product that my thesis company is selling.
    Do you recommend using the market basket analysis for this or are there any other data analysis methods that will be suitable for this topic?
    Ps: Thanks for your very informative videos, I genuinely think your work will help me loads with my thesis.

  • @AMA_AKAR
    @AMA_AKAR 2 ปีที่แล้ว +1

    Many Thanks for a great content!

  • @rassten
    @rassten 3 ปีที่แล้ว +2

    This is so awesome, best video on this tupic

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thank you for the kind words, very much appreciated! Glad you liked the video.

  • @britesjd
    @britesjd 3 ปีที่แล้ว +4

    Hey, very nice video. I would love to see more videos focused on specific analysis techniques, rather than tools

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +3

      Thank you for the feedback, much appreciated! Could you provide some more context on this?
      Are you referring to video context focused on the techniques/algorithms in the abstract?
      For example, in this video I don't cover the apriori algorithm that is a common market basket analysis implementation.

    • @britesjd
      @britesjd 3 ปีที่แล้ว +2

      @@DaveOnData sure! I was trying to say that the videos that I enjoy the most are the ones that you provide a bit of context on a particular technique/algorithm e do a quick showcase in it`s application. Another example was the "Classification Tree Intuition" video.
      I really liked this Market Basket Analysis video in particular, because your formalized for me a type of analysis that I've done before, but wasn't even aware that it had a name. So it was particularly useful for me, to have a more formal grasp of the technique and be able to google further into it!
      Of course, it is just my opinion. I'll keep a look out for your future content either way! Keep it up! 🖤

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +2

      Awesome, thanks for the clarification!
      I'm thinking you might like my latest video on K-means cluster analysis with Excel: th-cam.com/video/YuxwkchSAW4/w-d-xo.html

  • @brentcos9370
    @brentcos9370 2 ปีที่แล้ว +1

    Outstanding tutorial! You present very well and your method of teaching is excellent. I have used Python for amateur data analysis but I have not used R. Do you see a big difference between the two? Keep in mind I'm not a mathematician or computer science guy. 👍👍

  • @wanderingbutnotlost4934
    @wanderingbutnotlost4934 2 ปีที่แล้ว +1

    I ran the solver and it came back with Female and HasElderly as the best combo, with a lift of 2.53. Did I do something wrong?

  • @olasunkanmibabatope9859
    @olasunkanmibabatope9859 2 ปีที่แล้ว

    Thank you very much for this content

  • @sethforbes2739
    @sethforbes2739 ปีที่แล้ว

    If we only are looking for associations with those who've survived, could we clean up this data set to include only records where survived = 1?

  • @americo_alves
    @americo_alves 3 ปีที่แล้ว +1

    Awesome, David! Do you have something like this excel for conjoint analysis simulation of scenarios?

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +2

      Glad you like the content! At this time I do not have any content on conjoint analysis.

  • @ajaykulkarni576
    @ajaykulkarni576 3 ปีที่แล้ว +1

    David, Great Content!, best explanation. By the way, what are support and confidence?

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thanks, Ajay! Glad you liked the video.
      In a nutshell:
      Support - The percentage of transactions in the data with a particular itemset. For example, using the Titanic data, the percentage of passengers that were both female and traveling in 1st class.
      Confidence - The probability that a transaction has the items on the left-hand side of the rule and also contains the item on the right-hand side of the rule. Take the following rule using the Titanic data:
      {female, 1st Class} => Survived
      The confidence is the probability of seeing females traveling in first class that survived.

  • @brentcos9370
    @brentcos9370 2 ปีที่แล้ว

    Hi David, I'm looking at my transactional data where each row represents an order and the columns are the products (107 total columns). For each order (row), I have either a 1 (purchased) or 0 (did not purchase) in every cell. I went through your tutorial and it was great. But what I do not understand is that I do not have a RH column. If your example, you had RH as survived 1 or 0. But I do not have such a column as every row is a purchase so yes, every one purchased something. What am I to do?

  • @Marwahassan-l6s
    @Marwahassan-l6s ปีที่แล้ว

    Hi David, great tutorial. Q: how can you use the same concept for associations that are in decimals like 0.8, 0.4, 0, 1? and how can you highlight that significance is above 0.7 for instance? is that doable in excel?

  • @ericfritts5994
    @ericfritts5994 2 ปีที่แล้ว +2

    This video is great but does anyone know how to organize raw data into a workable table? What do you do to organize the data if you have columns with variables in random order. Using the titanic example, what if your passenger data looks like this for thousands of rows:
    Passenger 1| single | survived | third class | adult
    Passenger 2| survived | child | with group | second class

  • @shobindas680
    @shobindas680 3 ปีที่แล้ว +1

    Great video David

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thank you! Glad you liked the video.

  • @TheMohitosh007
    @TheMohitosh007 3 ปีที่แล้ว +1

    Hey dave, Amazing Content. Do you have the R code for this analysis as well?

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Check out the following blog post on my website:
      💻 www.daveondata.com/blog/titanic-market-basket-analysis-with-r/

  • @PillsLifestyleReviews
    @PillsLifestyleReviews 3 หลายเดือนก่อน

    I Finally mangaed to duplicate this in Power BI. Nearly broke me but I feel like I've got superpowers now :)
    The formula for 3 way lift however is different from the traditional formula seen elsewhere (Number of A,B, C Occurences / (Total transactions * Fraction of A occurrences * Fraction of B occurrences * Fraction of C occurrences)).
    This Formula used here seems to be (Number of A,B, C Occurences / (Fraction of A Occurances * Fraction of B&C Occurances).
    Is there a reason for this? I know this is probably TLDR :D

  • @shikha1311
    @shikha1311 2 ปีที่แล้ว

    Does the table data needs to be in binary form only to conduct MBA? like in thr form of 0 and 1 only?

  • @salahderdar5148
    @salahderdar5148 3 ปีที่แล้ว +1

    Great job, but isn't it easy if we use the correlation feature of data analysis?

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Great question!
      Did you have a specific correlation technique in mind?
      I'm thinking that cosine similarity might be better between the vectors (e.g., moving from binary to "product" counts), but then there's the problem of handling all the vector combinations (e.g., a rule using 4 "products" on the LHS and 1 "product" on the RHS.

    • @salahderdar5148
      @salahderdar5148 3 ปีที่แล้ว +1

      @@DaveOnData
      Yes, you're right
      Thank you David for this clarification

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      My pleasure!

  • @JasonMinhas
    @JasonMinhas 2 ปีที่แล้ว

    In articles on the same topic I see that Lift is calculated using Confidence, how come Confidence is not being applied here?

  • @dmitrijssulickis3014
    @dmitrijssulickis3014 3 ปีที่แล้ว +1

    Good content, liked it

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thank you for the feedback! Most appreciated. 🙏🙇‍♂️

  • @youdontmatter5085
    @youdontmatter5085 2 ปีที่แล้ว

    WOW your AWESOME!!

  • @midwest042003
    @midwest042003 3 ปีที่แล้ว +1

    Gosh Dave! You are just the best!! I wish you could be my mentor. Thanks so much for this. It would be really nice to learn how to do this with R.
    #TeamR

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Thank you for the compliment and glad you liked the video.
      If you're interested in market basket analysis with R, check out this blog post on my website:
      💻 www.daveondata.com/blog/titanic-market-basket-analysis-with-r/

  • @stanley_wealth1
    @stanley_wealth1 2 ปีที่แล้ว

    I must confess that learning how to use the software is what gets procastinating to start making soft. It scares lol

  • @Nr0991
    @Nr0991 ปีที่แล้ว

    viele danke

  • @mohamed.montaser
    @mohamed.montaser 3 ปีที่แล้ว +1

    why you didn't explain the formulas in this sheet and the concept behind them, for example the 2 way method formulas and table is not like the 3 way method

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +2

      Great question!
      Unfortunately, that was an editorial call because I wanted to keep the video from getting too long.
      That and most folks on TH-cam aren't interested in the math - they just want to get things done.
      I might follow up with a more detailed video on market basket analysis math later.
      If you're interested in the math right now, check out the following blog post:
      towardsdatascience.com/a-gentle-introduction-on-market-basket-analysis-association-rules-fa4b986a40ce

    • @mohamed.montaser
      @mohamed.montaser 3 ปีที่แล้ว +1

      @@DaveOnData
      thanks and please make a video about market basket analysis math.
      you are one of the few content creators in data on TH-cam that can explain quite well with easy example, I started following your posts on LinkedIn then I jumped on your channel here.

    • @DaveOnData
      @DaveOnData  3 ปีที่แล้ว +1

      Copy that, I'll add it to my video backlog! And thank you for the compliment, very much appreciated. 🙏🙇‍♂️

  • @kurodesu404
    @kurodesu404 2 ปีที่แล้ว

    how to transform like titanic dataset but i have dataset like this
    id_transaction products
    1 egg,bread
    2 bread,milk,peanuts
    3 egg,milk, peanuts

  • @teranginofficial
    @teranginofficial ปีที่แล้ว

    Cool

  • @Johnny-lw1yy
    @Johnny-lw1yy ปีที่แล้ว

    Hello Smart people of the internet. Can anyone please explain to me why his frequency data set doesn't add up to 100% ??

    • @kristoferbrown8007
      @kristoferbrown8007 ปีที่แล้ว

      Because multiple features can occur at the same time. Ex: You can be Female, FirstClass and HasChild.

  • @newagedesigntz
    @newagedesigntz 8 หลายเดือนก่อน +1

    I dont have solver

    • @DaveOnData
      @DaveOnData  8 หลายเดือนก่อน

      The Solver is an Add-in that is not enabled by default. Check out the following article from Microsoft:
      support.microsoft.com/en-us/office/load-the-solver-add-in-in-excel-612926fc-d53b-46b4-872c-e24772f078ca

  • @BlueSkyGoldSun
    @BlueSkyGoldSun 2 ปีที่แล้ว

    You encoded your data between 0-1 why ?

  • @avishekkumardas7738
    @avishekkumardas7738 ปีที่แล้ว

    Not enough in detail

  • @bijayadevelop
    @bijayadevelop 2 ปีที่แล้ว

    To those wondering where Solver Parameters is: support.microsoft.com/en-us/office/load-the-solver-add-in-in-excel-612926fc-d53b-46b4-872c-e24772f078ca