Baseball Prediction using Machine Learning - Building the First Model

  • Published on Mar 7, 2023
  • This is the second video in a series where we will attempt to predict the winning probabilities for Major League Baseball games using modern machine learning techniques. In this video we explore the data frame we previously built, and then use gradient boosting to build a model to predict the winning probability of each team based on team-level hitting statistics only.
    Data: www.retrosheet.org
    Notebook: github.com/numeristical/resou...
    Personal links:
    Consulting: www.numeristical.com
    Github: github.com/numeristical
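
For readers who want to see the idea behind the description above in miniature, here is a hedged, from-scratch sketch of gradient boosting for a win probability. It is not StructureBoost and not the notebook's code: the data are synthetic, the single feature (a home-minus-visitor difference in a hitting statistic) is hypothetical, and every function name is made up for illustration. Each round fits a one-split tree (a stump) to the residuals `y - p` and adds a small multiple of it to the log-odds.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Find the single-feature threshold whose two leaf means best fit the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so the right leaf is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)

def fit_boosting(xs, ys, num_trees=30, learning_rate=0.1):
    base = math.log(sum(ys) / (len(ys) - sum(ys)))  # log-odds of the overall win rate
    scores = [base] * len(xs)
    stumps = []
    for _ in range(num_trees):
        # The negative gradient of log loss with respect to the score is y - p
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        scores = [s + learning_rate * (lv if x <= t else rv) for x, s in zip(xs, scores)]
    return base, learning_rate, stumps

def predict_proba(model, x):
    base, learning_rate, stumps = model
    score = base + learning_rate * sum(lv if x <= t else rv for t, lv, rv in stumps)
    return sigmoid(score)

def log_loss(ys, ps):
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(ys, ps)) / len(ys)

random.seed(0)
# Synthetic, hypothetical feature: home-minus-visitor gap in a hitting statistic
xs = [random.gauss(0.0, 0.02) for _ in range(200)]
ys = [1 if random.random() < sigmoid(0.15 + 40 * x) else 0 for x in xs]

model = fit_boosting(xs, ys)
preds = [predict_proba(model, x) for x in xs]
baseline = [sum(ys) / len(ys)] * len(ys)
print("boosted log loss: ", log_loss(ys, preds))
print("baseline log loss:", log_loss(ys, baseline))
```

The comparison against a constant win-rate baseline is on the training data only, so it shows that the boosting loop is reducing log loss, not that the toy model generalizes.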

Comments • 31

  • @user-cj6bx8kz5g 4 months ago +1

    This is great, excellent content!! What a wonderful way to apply the knowledge I'm learning on my certificates!

    • @numeristical 4 months ago

      Glad you like it! Tell your friends!

  • @josephsinclair2399 1 year ago +1

    Exactly what I've been looking for, thanks for the content

  • @JPPHandicapping 1 year ago +2

    Really an excellent video. Thank you so much for providing this content for free.

    • @numeristical 1 year ago +1

      Thanks! Tell your friends! :)

  • @anyssalucena6822 1 year ago +1

    Fantastic! Great content

    • @numeristical 1 year ago

      Thanks! Be sure to watch the whole series :)

  • @mikemilana2217 5 months ago

    I’ve been looking for something like this thanks for posting!

    • @numeristical 5 months ago

      Glad you like it!

  • @printhelloworld1413 2 months ago

    Good stuff! Thanks

    • @numeristical 2 months ago

      Glad you liked it!

  • @Andrew-nj4vi 1 year ago

    Fantastic video. Quick question though. When running the StructureBoost model, I am running into the following error:
    TypeError Traceback (most recent call last)
    Cell In[38], line 1
    ----> 1 stb1 = stb.StructureBoost(num_trees=2000,learning_rate=.02,max_depth=3)
    2 stb1.fit(X_train, y_train, eval_set=(X_valid, y_valid), early_stop_past_steps=5)
    File structure_gb.pyx:126, in structureboost.structure_gb.StructureBoost.__init__()
    TypeError: __init__() takes at least 3 positional arguments (2 given)
    Any advice?
    Thanks!

    • @numeristical 1 year ago

      Thanks for the comment! You are right - I made some changes to my local version of StructureBoost and forgot to push them out. I'll fix the notebook now and push out an updated version that works...

    • @numeristical 1 year ago

      As a quick fix, replace that cell with these lines:
      fc = stb.get_basic_config(X_train, stb.default_config_dict())
      stb1 = stb.StructureBoost(max_depth=3, learning_rate=.02, feature_configs = fc, num_trees=2000)
      stb1.fit(X_train, y_train, eval_set=(X_valid, y_valid), early_stop_past_steps=5)
      Let me know if that works. I'll also fix the notebook. Really appreciate you finding this bug!

    • @Andrew-nj4vi 1 year ago

      @numeristical Yes, that did work for me! Thank you!

  • @jefraz2003 1 year ago

    Great videos!! I was wondering: do I need a specific version of Visual Studio to run these, or will the free version work? Thanks!

    • @numeristical 1 year ago

      Glad you like them! I'm a Mac person, so unfortunately I don't know the details about Visual Studio. Any setup that runs Jupyter with Python should work, though...

    • @jefraz2003 1 year ago

      @numeristical Thank you! I am still having trouble installing ml_insights and structureboost. I keep getting the error: Failed building wheel for splinecalib. Even if I am running Jupyter, I will still need these, correct? Thanks for the help!

    • @numeristical 1 year ago

      Hmmm... yeah, those errors have popped up recently. I think the newer versions of Python are causing some trouble. You might try using Python 3.9 and see if that helps (that's the version I am developing on). I plan to remove the SplineCalib dependency from StructureBoost soon, so that should solve the problem for that package at least (once I can push out a change).

    • @jefraz2003 1 year ago

      @numeristical Thanks for the reply! Tried Python 3.9 and still get the errors for both. I will try and wait for the new package.

    • @byagnik 6 months ago

      @jefraz2003 Did you find a solution? I am still running into the error you mentioned.

  • @ItsRabb 9 months ago

    Would you know why I could be getting this error?
    I have tried just about everything, even downloading the tar files directly and moving them into the correct folder to see if that would help. I even downgraded Python to exactly 3.9.13.
    I cannot figure out why structureboost and ml_insights will not import.
    Import "structureboost" could not be resolved
    Import "ml_insights" could not be resolved
    Import "structureboost" could not be resolved

    • @numeristical 9 months ago +1

      Haven't seen that error myself. Googling "import could not be resolved" suggests it might have something to do with VS Code. You might try reading some of the posts related to the error to see if they are applicable to your environment.
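
A common cause of this kind of editor message (an assumption here, not something confirmed in the thread) is that the packages were installed into a different Python interpreter than the one the editor or notebook is using. The sketch below uses only the standard library to print the active interpreter and report whether each module can be found from it; `check_imports` is a hypothetical helper name.

```python
import importlib.util
import sys

def check_imports(modules=("structureboost", "ml_insights")):
    """Return {module_name: True/False} for whether this interpreter can locate each module."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}

print("Interpreter:", sys.executable)
for name, found in check_imports().items():
    print(f"{name}: {'importable' if found else 'NOT found in this interpreter'}")
```

If a module shows as not found, installing it with that same interpreter (for example by running `python -m pip install ...` using the path printed above) and selecting that interpreter in the editor is usually worth trying.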

  • @Andrew-nj4vi 1 year ago

    Hello again,
    I am running into another error when running:
    preds_stb = stb1.predict(X_test)
    ValueError Traceback (most recent call last)
    Cell In[23], line 1
    ----> 1 preds_stb = stb1.predict(X_test)
    File structure_gb.pyx:516, in structureboost.structure_gb.StructureBoost.predict()
    File structure_gb.pyx:587, in structureboost.structure_gb.StructureBoost._predict_fast()
    File structure_gb.pyx:788, in structureboost.structure_gb.predict_with_tensor_c()
    ValueError: Buffer dtype mismatch, expected 'long' but got 'long long'
    Thanks!

    • @numeristical 1 year ago +1

      Hmmm... I've seen issues like this before. Can you file a bug on the GitHub page for structureboost? (If you know how to do that.) It would help to know what OS, version of Python, etc. you are using (I've seen this happen with Windows users - I develop everything on Mac).
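
To gather the details requested above in one go, something like the following standard-library sketch can be pasted into a notebook cell and its output copied into the bug report. The helper name `environment_report` and the list of distribution names are assumptions for illustration; any package that is not installed is simply reported as such.

```python
import platform
import sys
from importlib import metadata

def environment_report(packages=("structureboost", "ml_insights", "numpy", "Cython")):
    """Build a short text summary of OS, Python, and installed package versions."""
    lines = [
        f"OS: {platform.system()} {platform.release()} ({platform.machine()})",
        f"Python: {platform.python_version()} at {sys.executable}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)

print(environment_report())
```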

    • @numeristical 1 year ago +1

      @Andrew I found the problem - need to make a patch in the structureboost code.
      In the meantime, if you replace `predict` with `_predict_py` you should be able to get that line to run (albeit slower). However, it will likely still fail when you reach the `ice_plot`

    • @numeristical 1 year ago

      @Andrew This (hopefully) should be resolved with the latest version of structureboost. Please upgrade to 0.4.1 (or higher) and let me know if things work now
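
Some background on the error message, offered as a likely explanation and not as the author's diagnosis: compiled code that expects a C `long` buffer will reject a 64-bit integer array on platforms where C `long` is only 32 bits, which is the case on 64-bit Windows but typically not on macOS or Linux. This standard-library check shows the widths on the current machine.

```python
import ctypes
import platform

# Width of the C integer types on this platform, in bits
long_bits = ctypes.sizeof(ctypes.c_long) * 8
long_long_bits = ctypes.sizeof(ctypes.c_longlong) * 8

print(f"Platform: {platform.system()}")
print(f"C long: {long_bits} bits, C long long: {long_long_bits} bits")
if long_bits != long_long_bits:
    print("C long and long long differ here, so 64-bit integer arrays are 'long long', not 'long'.")
```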

  • @realsportsbrain 9 months ago

    Hello, great video. I'm following along and was wondering why I would get this error when installing ml-insights:
    splinecalib/loss_fun_c.c:198:12: fatal error: longintrepr.h: No such file or directory
    198 | #include "longintrepr.h"
    |

    • @numeristical 9 months ago

      This seems to be an incompatibility with Cython that was introduced with Python 3.11. You might try downgrading to Python 3.9 and see if that helps. See this thread for more info:
      github.com/aio-libs/aiohttp/issues/6600
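
As a quick way to tell whether an environment falls into the case described above, this small sketch checks the running Python version. The `longintrepr.h` header was moved in Python 3.11, so C sources generated by older Cython releases that include it directly can fail to compile there; `longintrepr_risk` is a hypothetical helper name.

```python
import sys

def longintrepr_risk(version_info=sys.version_info):
    """Return True for Python 3.11+, where older Cython-generated C files may not compile."""
    return tuple(version_info[:2]) >= (3, 11)

if longintrepr_risk():
    print("Python 3.11+ detected: packages built from older Cython output may fail to compile.")
else:
    print("Python is older than 3.11: the longintrepr.h issue should not apply.")
```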