John Helveston
Cleaning EIA fuel price data in R
I needed data on gasoline prices by US state for a project, so I decided to record my raw data cleaning process from start to finish. Data cleaning is time-consuming, and I've picked up a lot of tricks over the years, so I figured I'd show the whole process on a real example.
Raw fuel price data from EIA here: www.eia.gov/dnav/pet/pet_pri_gnd_dcus_nus_m.htm
Time stamps:
2:15 Setting up R file.
2:35 Reading in the first Excel file and quick cleaning.
9:45 Trying to parameterize the file name.
13:10 Making a tribble for matching file names to region names.
20:20 Debugging.
24:00 Realizing just how messy this data is...omg CA why.
32:58 Return from break with new strategy in mind.
51:15 First cut of the "tidy" data done.
55:22 Epilogue: fixing the missing values.
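The lookup-table step in the time stamps (matching file names to region names, then parameterizing the file name) might look roughly like the sketch below. It is only an illustration under stated assumptions: the file names, region labels, and `data` folder are placeholders, and `read_prices()` is a hypothetical stand-in for a real Excel reader such as `readxl::read_excel()`. The video uses a tribble; a base R data frame is swapped in here so the sketch has no package dependencies.

```r
# Hypothetical lookup table: these file names and region labels are
# placeholders, not the actual EIA file names used in the video.
file_regions <- data.frame(
  file   = c("region_a.xls", "region_b.xls"),
  region = c("Region A", "Region B"),
  stringsAsFactors = FALSE
)

# Build each file path from the table instead of hard-coding it.
data_dir <- "data"  # assumed folder holding the downloaded files
file_regions$path <- file.path(data_dir, file_regions$file)

# Stand-in reader so the sketch runs without the raw files; in practice
# this is where something like readxl::read_excel(path) would go.
read_prices <- function(path) {
  data.frame(date = as.Date("2020-01-01"), price = NA_real_)
}

# Read every file, tag its rows with the matching region, and stack them.
prices <- do.call(rbind, lapply(seq_len(nrow(file_regions)), function(i) {
  df <- read_prices(file_regions$path[i])
  df$region <- file_regions$region[i]
  df
}))
```

Keeping the file-to-region mapping in one table means adding a region is a one-row change rather than another copy of the reading code.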
Views: 237

Videos

The cbcTools Package: Tools for Designing and Testing Choice-Based Conjoint Surveys in R
1.9K views · 2 years ago
My talk for the 2022 Sawtooth Software Conference. The package functionality presented is current as of the publication date of this video, but some things may have since changed with newer updates to the package. {cbcTools} documentation: jhelvy.github.io/cbcTools/ {cbcTools} source code: github.com/jhelvy/cbcTools/ My website: www.jhelvy.com/
Modeling Heterogeneous Preferences
766 views · 2 years ago
In this lecture, I introduce two ways to include heterogeneity in choice models: including interaction terms, and mixed logit (hierarchical models). The "yogurt" dataset comes as part of the logitr package. Once installed and loaded, the "yogurt" data frame will be available in the environment. Links to Sections: 00:07 - Background on homogeneous random utility models 00:53 - Overview of two typ...
Willingness to Pay (WTP) and Market Simulation
1.5K views · 2 years ago
In this lecture, I discuss how we use estimated utility model coefficients to compute willingness to pay (WTP) and simulate market shares for different markets. I also show how to use draws of the model coefficients to incorporate uncertainty into the WTP and market share calculations. 00:14 - Background: Utility model and maximum likelihood estimation 00:49 - Willingness to pay (WTP) 02:31 - M...
Design of Experiments
759 views · 2 years ago
In this lecture, I discuss the concept of Design of Experiments (DOE) and how experiment design can impact the amount of information available about different model effects. 06:45 - Main effects 07:21 - Interaction effects 08:03 - Full factorial designs, balance, and orthogonality 10:06 - Fractional factorial designs: non-orthogonal 11:53 - Fractional factorial designs: orthogonal 13:12 - Decis...
Uncertainty
567 views · 3 years ago
In this lecture, I discuss how we quantify the uncertainty around model parameter estimates that result from the maximum likelihood estimation. 00:04 - Background: Utility model and maximum likelihood estimation 01:02 - Relationship between uncertainty & the curvature of the log-likelihood function 03:57 - Uncertainty reporting using standard errors 04:54 - Practice question 1 05:48 - Computing...
Maximum Likelihood Estimation & Optimization
2.2K views · 3 years ago
In this lecture, I discuss how to use maximum likelihood estimation to estimate the coefficients in a utility model. I start by explaining what a likelihood function is, and then explain general principles of optimization. Note: At 03:08 (and other places) in the video where I say to compute the log-likelihood, the calculations are incorrect. They should be the sum of log(L), like log(L1) log(...
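The correction in the note can be written out as follows. The symbols are assumptions that follow the note's L1 notation: $L_n$ is the likelihood contribution of observation $n$, and $N$ is the number of observations. Because the overall likelihood is the product of the individual terms, its log is the sum of their logs:

```latex
\log L \;=\; \log \prod_{n=1}^{N} L_n \;=\; \sum_{n=1}^{N} \log L_n
```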
Willingness to pay estimates from preference & WTP space utility models: the logitr package
4K views · 3 years ago
My talk for the 2021 Sawtooth Conference Turbo Choice Modeling Panel. Note that some of the functions and / or arguments in the logitr package have changed with newer updates to the package compared to what was shown in this video. Related links: My website: www.jhelvy.com/ logitr documentation: jhelvy.github.io/logitr/ logitr source code: github.com/jhelvy/logitr
ggxaringan
623 views · 4 years ago
A short screen recording demonstrating how I use the infinite_moon_reader() function from the xaringan package to achieve continuous integration while creating and customizing a ggplot2 chart. Some of the benefits I've found from this approach include: - Immediate updating of the chart as I edit it. - Using the chunk settings fig.height and fig.width, I can preview what dimension settings I sho...
Modeling Heterogeneous Preferences (old)
690 views · 5 years ago
In this lecture, I introduce two ways to include heterogeneity in choice models: including interaction terms, and mixed logit (hierarchical models). The "yogurt" dataset comes as part of the logitr package. Once installed and loaded, the "yogurt" data frame will be available in the environment. Links to Sections: 00:07 - Background on homogeneous random utility models 00:53 - Overview of two typ...
Introduction to choice modeling
8K views · 5 years ago
In this lecture, I introduce core concepts in choice modeling, including probability and random utility models. For more details about the logit model, see Chapter 3 of Kenneth Train’s book, “Discrete Choice Methods with Simulation”: eml.berkeley.edu/books/choice2nd/Ch03_p34-75.pdf Links to Sections: 00:11 - Review of Probability 03:46 - Practice Questions 1 04:27 - Random Utility Theory 09:07 ...
Adding a fixed choice conjoint question in Qualtrics
5K views · 5 years ago
Link to R demo files in video: box.jhelvy.com/docs/emse6035/qualtrics_fixed_choice_demo.zip In this video I show two ways to create a fixed choice question in a Qualtrics conjoint survey: 1) One approach is to copy and paste the html code to create a conjoint question table that is the same for every respondent. This only enables you to put a fixed choice question before or after the main conjo...

Comments

  • @arturocdb
    @arturocdb 3 months ago

    Hi, thank you so much for a great lecture. I would really appreciate it if you could share a reference book. Thank you!

  • @investigaonline
    @investigaonline 6 months ago

    Great Dr.Helveston. Thanks👏👏👏

  • @investigaonline
    @investigaonline 6 months ago

    Great!!! I have learned and understood the basis and the development of discrete choice modelling. A lot of thanks for this work. I will work with cbcTools and logitr in the future ...

  • @mima9416
    @mima9416 8 months ago

    Hello, thank you for this package that seems to address some issues in a very straightforward way. I am trying to use this package for some choice experiment data. However, my data have multiple choice tasks. In other words, each individual will make several choices. I calculated obsID = "ID_task" and not obsID = "ID_resp". I calculated the models using both mlogit and logitr, but with logitr, apart from price, the rest of the variables are not significant. Therefore, results are completely different. I assume there might be a problem with my assumption. Any idea about how to resolve the issue with multiple choice tasks? Thanks in advance for your help. Best

    • @JohnHelveston
      @JohnHelveston 8 months ago

      obsID is a sequence of repeated numbers that identifies each unique choice observation. See more here: jhelvy.github.io/logitr/articles/data_formatting.html
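
      As a minimal sketch of what that means (not taken from the logitr documentation), assume long-format data with one row per alternative. The column names respID, taskID, and altID are hypothetical. If task numbers restart for every respondent, the task column alone repeats across respondents, so a unique observation ID can be built from the respondent-task pair:

      ```r
      # Hypothetical data: 2 respondents x 2 choice tasks x 3 alternatives.
      # altID varies fastest, so the rows of each choice task sit together.
      choices <- expand.grid(altID = 1:3, taskID = 1:2, respID = 1:2)

      # taskID alone is NOT unique: task 1 appears for both respondents.
      # Combine respondent and task, then number each distinct pair in
      # order of first appearance.
      key <- paste(choices$respID, choices$taskID, sep = "_")
      choices$obsID <- match(key, unique(key))

      # Each set of 3 alternative rows now shares one repeated obsID.
      print(choices)
      ```

      The new column's name would then be what gets passed as the obsID argument, for example obsID = "obsID".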

  • @mylifeisinhishandsamen4167
    @mylifeisinhishandsamen4167 8 months ago

    Thank you!

  • @danaqarout5975
    @danaqarout5975 10 months ago

    Thanks so much! This was hugely helpful. I did have a question: for some reason, when I ran the power calculation with no_choice=TRUE design, I kept getting an error "UseMethod("se")". I followed the exact same code that you did. When I looked at the data structure, I noticed the columns weren't the attributes, but the levels. I re-ran the code multiple times with no_choice = true /false, and the data structure always changed when the no_choice was true. Am I doing something wrong?

  • @ApatsaSelemaniAP
    @ApatsaSelemaniAP 11 months ago

    so how do you create a questionnaire based on the results?

    • @HarpLover
      @HarpLover 13 days ago

      I’m trying to figure this out too

  • @thalesprado6371
    @thalesprado6371 1 year ago

    This will save me. In pricing we try to run conjoint all the time; you just made my work much easier, thank you.

    • @jhelvy
      @jhelvy 1 year ago

      🙏

  • @yudhapurbawa5806
    @yudhapurbawa5806 1 year ago

    hello, how if we wanted to use the ranking system in the choice based design? and after we get our data survey, can you show how to analyze it using R? thank you

  • @Ryandh288
    @Ryandh288 1 year ago

    This is so great.

  • @dnvhung85
    @dnvhung85 2 years ago

    Hi John, I tried to make a conjoint design with a trial of DesignXM Qualtrics. I cannot save the conjoint design. There is an error. Does it occur because of my trial plan? Thank you for your advice!

  • @VKjkd
    @VKjkd 2 years ago

    Excellent work

  • @swarnavasarkar4360
    @swarnavasarkar4360 2 years ago

    Hello! I was trying to use this method to create randomized attribute profiles for my thesis. When using the cbc_design code, it says "undefined columns selected". Can you tell me where I am going wrong? Thanks!

    • @jhelvy
      @jhelvy 2 years ago

      Hi, thanks for your comment, and sorry I didn't see it until now. Can you please post an issue on Github with more details? I can help address the issue there: github.com/jhelvy/cbcTools/issues

  • @ahmed007Jaber
    @ahmed007Jaber 2 years ago

    Thank u for this. Is there a way to spread a long table in xaringan slides? Suppose u would use the iris dataset and need to spread it across slides so that u can print em on paper

    • @JohnHelveston
      @JohnHelveston 1 year ago

      Not to my knowledge, though I would ask why you would want to do this? You can scroll inside xaringan if you make the table dynamic by using a package like DT or reactable.

    • @ahmed007Jaber
      @ahmed007Jaber 1 year ago

      @JohnHelveston hi thank you for the feedback; thinking of using it as a template to print out

  • @dc33333
    @dc33333 2 years ago

    Incredible!!! Very useful. Every data scientist/engineer should see this.

  • @christiansetzkorn6241
    @christiansetzkorn6241 3 years ago

    amazing thanks!

  • @maksim0933
    @maksim0933 3 years ago

    It would be great to see more tutorials on xaringan :) thank you 😊

    • @JohnHelveston
      @JohnHelveston 3 years ago

      Thanks! My example here was just to show how I use the xaringan::inf_mr() function to be able to interactively view the output of a ggplot chart.