SHAP with Python (Code and Explanations)

  • Published on 20 Jan 2025

Comments • 99

  • @adataodyssey
    @adataodyssey  11 months ago +3

    *NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
    SHAP course: adataodyssey.com/courses/shap-with-python/
    XAI course: adataodyssey.com/courses/xai-with-python/
    Newsletter signup: mailchi.mp/40909011987b/signup

    • @mohadesehkeshavarz9107
      @mohadesehkeshavarz9107 9 months ago

      Why can't I get the XAI course for free? Has the offer ended?

    • @adataodyssey
      @adataodyssey  9 months ago

      @@mohadesehkeshavarz9107 if you sign up for the newsletter, you will get a coupon that gives you free access to the XAI course. If you are still having trouble, send me your email on Instagram.

  • @pilarangelicarodriguezcaba8199
    @pilarangelicarodriguezcaba8199 11 months ago +5

    Really easy to understand, a lot better than the official documentation for the SHAP plots

    • @adataodyssey
      @adataodyssey  11 months ago

      Thank you! This was my motivation for the content. Had to do a lot of work to understand the method fully :)

  • @becarivera
    @becarivera 18 days ago

    I loved this explanation!! better than many other videos I have watched, thanksss

    • @adataodyssey
      @adataodyssey  18 days ago

      I'm glad I could help!

  • @cutestbear3327
    @cutestbear3327 1 year ago +2

    thank you for the awesome video~ really like the way you explain everything thoroughly and meticulously. really friendly to people like us who have just begun our journey into data science

    • @adataodyssey
      @adataodyssey  1 year ago

      I'm glad you found it useful! Are there any other related concepts you are interested in learning about?

    • @cutestbear3327
      @cutestbear3327 1 year ago +1

      @@adataodyssey hi conor, thnx for your kind reply. i am happy to go with whatever topic you dive into. maybe random forest (and its hyperparameter tuning) since it is such a classic?
      may you have fun and enjoy continued success on youtube~~ cheers

  • @ShotClockHoops
    @ShotClockHoops 10 months ago

    This is the best way to explain explanations 😁
    I am interested to see a video of yours with more complex models like Deep Neural Networks on Signal Data and how can we use SHAP on that.
    Great work!

    • @adataodyssey
      @adataodyssey  10 months ago

      Thank you! I will keep that in mind

  • @thegerman1239
    @thegerman1239 1 year ago

    Thank you so much for this awesome video! I'm currently writing a term paper about this topic and other machine learning explainability techniques. This helped me out a lot while creating my examples!
    Kind regards from Germany!

    • @adataodyssey
      @adataodyssey  1 year ago +1

      Guten tag! I'm glad this helped. I also have videos about the maths behind Shapley values:
      th-cam.com/video/UJeu29wq7d0/w-d-xo.htmlsi=-s-QTmLoQmSiYwFD
      th-cam.com/video/b9qqbFudVhI/w-d-xo.htmlsi=uMpSUk7ue6Tzs8SQ

    • @thegerman1239
      @thegerman1239 1 year ago

      Hey I'm done with the paper! The videos about the math really helped me as well. You're a champ

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@thegerman1239 Great stuff! All the best with the result.

  • @yael123gut
    @yael123gut 3 months ago

    It was so clearly and well explained, thank you!

    • @adataodyssey
      @adataodyssey  3 months ago

      My pleasure :)

  • @tamojitmaiti
    @tamojitmaiti 11 months ago

    This is so clear and concise! Thank you!

    • @adataodyssey
      @adataodyssey  11 months ago

      No problem Tamojit! This is my goal. More XAI content is on the way.

  • @ShrijaSheth
    @ShrijaSheth 1 year ago

    I tried XGBoost on a different dataset, but it did not give a good scatter plot, nor a red line that clearly separates the observations. So which other model should one use if the number of features is 870?

    • @adataodyssey
      @adataodyssey  1 year ago +1

      This is too many features! You will never be able to get good explanations. Try to reduce the number of features by removing the highly correlated ones.
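
One way to act on the advice above is to keep only one feature from each highly correlated group before training. The snippet below is a minimal sketch using plain NumPy. The helper name `drop_correlated`, the 0.9 threshold, and the toy data are all assumptions made for illustration and do not come from the video.

```python
import numpy as np


def drop_correlated(X, names, threshold=0.9):
    """Greedily keep a column only if its absolute Pearson correlation
    with every column kept so far is at or below `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]


# Toy data (made up): "a_copy" is almost a linear copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b])

X_reduced, kept = drop_correlated(X, ["a", "a_copy", "b"])
print(kept)
```

The greedy order matters: the first column of each correlated group is the one that survives, so it may be worth sorting the columns by domain relevance first.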

  • @elenal8494
    @elenal8494 5 months ago

    Thank you! your youtube videos are very helpful!

    • @adataodyssey
      @adataodyssey  5 months ago

      Glad I could help!

  • @brenoingwersen784
    @brenoingwersen784 4 months ago

    For categorical features @3:35 wouldn't it make sense to just create a full pipeline in which all raw features are preprocessed (scaled, encoded, etc) and run through the model to generate predictions and afterwards calculating the shap values? This way you have the categorical feature contribution in an interpretable way...

    • @adataodyssey
      @adataodyssey  4 months ago

      @@brenoingwersen784 The problem is if you have a categorical feature with many categories (say 10), you will have 10 dummy features after encoding. This means you will have 10 SHAP values for the categorical feature making it difficult to understand the overall effect of that feature.
      You can solve this by adding the SHAP values for each dummy feature or using catboost.
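
The "add the SHAP values for each dummy feature" idea can be sketched as below. This relies on SHAP values being additive, so summing the columns of the one-hot features gives a single contribution for the original categorical feature. The helper name `combine_dummy_shap`, the `sex.` prefix, and the numbers are all made up for illustration; with the shap package you would pass in the array from `shap_values.values`.

```python
import numpy as np


def combine_dummy_shap(shap_matrix, feature_names, prefix, new_name):
    """Sum the SHAP columns whose names start with `prefix` into one column."""
    names = list(feature_names)
    is_dummy = np.array([n.startswith(prefix) for n in names])
    if not is_dummy.any():
        raise ValueError(f"no feature name starts with {prefix!r}")
    combined = shap_matrix[:, is_dummy].sum(axis=1, keepdims=True)
    kept = shap_matrix[:, ~is_dummy]
    new_names = [n for n, d in zip(names, is_dummy) if not d] + [new_name]
    return np.hstack([kept, combined]), new_names


# Made-up SHAP values: 2 observations, 1 numeric feature, 3 dummy features.
shap_matrix = np.array([[0.10, 0.78, 0.00, 0.42],
                        [-0.20, 0.00, -0.30, 0.15]])
feature_names = ["length", "sex.M", "sex.F", "sex.I"]

combined, names = combine_dummy_shap(shap_matrix, feature_names, "sex.", "sex")
print(names)
print(combined)
```

Each row still sums to the same total as before, so the combined values remain consistent with the model's prediction for that observation.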

  • @yukiwang5825
    @yukiwang5825 1 year ago +1

    Wonderful video! Thanks for this.

  • @famin7794
    @famin7794 6 months ago

    Can't thank you enough. You solved my problem.

    • @adataodyssey
      @adataodyssey  6 months ago

      I'm glad I could help!

  • @mahsadehghan-ws1kn
    @mahsadehghan-ws1kn 8 months ago

    Thank you so much for this awesome video. When I use this code in the #Train model section, I encounter this error. What is the solution?
    [17:50:59] C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0b3782d1791676daf-1\xgboost\xgboost-ci-windows\src\data\array_interface.h:492: Unicode-7 is not supported.

    • @adataodyssey
      @adataodyssey  8 months ago

      There could be many things going wrong. You can try creating a Python environment and downloading the XGBoost package and only the other ones necessary to train the model.

  • @NasirUddin-im2zb
    @NasirUddin-im2zb 1 year ago

    When I was running my code I had this issue regarding shap: FutureWarning: In the future `np.long` will be defined as the corresponding NumPy scalar.
    long_ = _make_signed(np.long)
    I tried pip install with versions 1.20.0, 1.24.2, 1.22.2 and so on, but none of them work. What can I do? If you can suggest something, that would be great.

    • @adataodyssey
      @adataodyssey  1 year ago

      Hi Nasir, sorry about that. I've never seen that issue before. To confirm, do you mean that you installed different versions of NumPy?
      This link might help: github.com/neonbjb/tortoise-tts/issues/379
      They suggest trying:
      pip install numpy==1.20.0

  • @murilopalomosebilla2999
    @murilopalomosebilla2999 1 year ago

    Really well explained. Thanks ^^

    • @adataodyssey
      @adataodyssey  1 year ago

      No problem! I'm glad you found it useful

  • @slimanearbaoui1237
    @slimanearbaoui1237 1 year ago +1

    Can this library work with an LSTM model?

    • @adataodyssey
      @adataodyssey  1 year ago +1

      Hi Slimane :) I've never applied it to an LSTM model. Applying SHAP to deep learning models can be challenging. You may be able to apply SHAP to an LSTM model with some work.
      I have applied it to convolutional neural networks used for image classification and regression tasks. I've linked to two articles below. I used PyTorch. I know that SHAP also works with Keras.
      towardsdatascience.com/image-classification-with-pytorch-and-shap-can-you-trust-an-automated-car-4d8d12714eea?sk=b04dcbb8a09f049f605d2110b5c8d851
      towardsdatascience.com/using-shap-to-debug-a-pytorch-image-regression-model-4b562ddef30d?sk=7eb3016839186f1ba2a6f1f105f8ff64

  • @ooplectures3828
    @ooplectures3828 1 year ago

    Please explain how I can use SHAP to determine feature importance per class in a multiclass classification problem. I need to know which features, or values of features, contribute to the prediction of each class in a multiclass system.

    • @adataodyssey
      @adataodyssey  1 year ago

      This has been on the list for a while. I'm not sure when I'll be able to do it but hopefully soon!
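
Until that video exists, here is a rough sketch of the per-class idea. It assumes the SHAP values for a multiclass model are available as a 3-D array of shape (n_samples, n_features, n_classes). Recent versions of the shap package appear to use this layout, but some versions return a list with one 2-D array per class, so check the shape first. The function name and the numbers are made up for illustration.

```python
import numpy as np


def per_class_importance(values, feature_names, class_names):
    """Mean absolute SHAP value of each feature, computed separately per class.

    `values` is assumed to have shape (n_samples, n_features, n_classes).
    """
    values = np.asarray(values)
    if values.ndim != 3:
        raise ValueError("expected shape (n_samples, n_features, n_classes)")
    mean_abs = np.abs(values).mean(axis=0)  # shape (n_features, n_classes)
    return {
        cls: dict(zip(feature_names, mean_abs[:, k]))
        for k, cls in enumerate(class_names)
    }


# Made-up SHAP values: 2 samples, 2 features, 3 classes.
values = np.array([
    [[0.2, -0.1, -0.1], [0.0, 0.3, -0.3]],
    [[0.4, -0.3, -0.1], [-0.2, 0.1, 0.1]],
])
importance = per_class_importance(values, ["f0", "f1"], ["class 0", "class 1", "class 2"])
for cls, scores in importance.items():
    print(cls, scores)
```

Under the same shape assumption, slicing one class out of the array (for example `values[:, :, 1]`) gives an ordinary 2-D matrix of SHAP values that can be inspected like the binary or regression case.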

  • @sirireddy3102
    @sirireddy3102 8 months ago

    I am getting an error near model.fit. My data has text and numeric features.
    Can you help me resolve it?

    • @adataodyssey
      @adataodyssey  7 months ago

      You will probably need to use the SHAP text explainer

  • @rafaelagd0
    @rafaelagd0 1 year ago +2

    Great video! Could you comment on the future of SHAP? It seems the project was abandoned. The latest commit is from June 2022 and there is a pile of 1.5k issues. I couldn't find much information about it, and the other packages seem to depend on it. So there may be no alternative.

    • @adataodyssey
      @adataodyssey  1 year ago +3

      That is a good point, Rafael! I think SHAP has a good future regardless of the package. The method is widely used in industry and is based on solid theory: Shapley values, which have been around for a long time.
      For now the package works well for me. The 1.5k issues are more an indication of popularity than of major problems with the package. Hopefully, if it does run into serious issues, updates will be made. If not, I’m sure something will take its place.
      As I mentioned, it is very popular so someone is sure to take advantage of that. The code and method are all open source so it shouldn’t be too hard to replicate. I know there are already other implementations in R (see the iml package).

  • @adeauy2294
    @adeauy2294 4 months ago

    Nice video! The plots will be different for a Keras model, right? I followed your code but it seems that it won't work for a neural network model.

    • @adataodyssey
      @adataodyssey  4 months ago

      @@adeauy2294 The plots should be the same if you train a NN on tabular data. However, I’ve had a lot of trouble trying to get the package to work with PyTorch. I’m not sure about Keras but I expect you are having similar problems.

  • @possakornkittipipatthanapo1737
    @possakornkittipipatthanapo1737 7 months ago

    Hi, Shapley values are amazing for interpretation and model understanding. However, I haven't seen applications to multimodal models such as vision-language models, for example CLIP. Could you please provide an explanation or a reference for further research?

    • @adataodyssey
      @adataodyssey  7 months ago

      Hi I'm not too familiar with this area. I think SHAP is not the best for LLMs or generative models as you are not making predictions.

  • @mayuribhandari2224
    @mayuribhandari2224 4 months ago

    I have subscribed to the newsletter but am not getting access to the XAI course

    • @adataodyssey
      @adataodyssey  4 months ago

      You should receive a coupon code in your mail. Let me know if you don't get it!

  • @fouedhamouda7356
    @fouedhamouda7356 5 months ago

    Thanks,
    can I use Shap with GAN model?

    • @adataodyssey
      @adataodyssey  4 months ago

      SHAP is model agnostic so it could be used. However, SHAP can be difficult to implement for neural networks in general. I'm not aware of it being used for GANs.

  • @digitama
    @digitama 1 year ago

    Your explanation is very interesting, but I ran into a problem: "Numba needs NumPy 1.20 or less". No matter how far I downgrade NumPy and Numba, the problem doesn't go away. Any suggestions?

    • @adataodyssey
      @adataodyssey  1 year ago

      Sorry to hear that! Did you try only downgrading the NumPy package? Also you could try upgrading the Numba package instead so it is in line with the latest version of NumPy. Remember to restart your kernel after installing a new package, if you are working with a notebook.

    • @digitama
      @digitama 1 year ago

      @@adataodyssey I did downgrade Numba and haven't tried upgrading it. What version should I upgrade to?

  • @anki8136
    @anki8136 1 year ago

    Hey Connor, thanks for the course.
    I just have one doubt: how to explain the stacked force plot. I am having some problems with that.
    Can you make a video or something?

    • @adataodyssey
      @adataodyssey  1 year ago

      Hi Anki, I am sorry that the explanation was not clear. Yet, I am reluctant to make a video on the stacked force plot. This is because, in practice, I have not found it very useful. It is used to explore relationships between features and shap values. But you can do this using the dependence plots which are also easier to understand.
      In the course, I go into a bit more detail on the stacked force plot. Did you see that section?

    • @anki8136
      @anki8136 1 year ago

      @@adataodyssey No, I haven't seen that video yet, but I will watch it now

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@anki8136 Okay, hopefully that clears things up for you. It is in the aggregations lesson

  • @DarkKnight7_1
    @DarkKnight7_1 1 year ago

    Hi Connor, you mentioned in the limitations of SHAP values that "highly correlated features are a problem when using shap values technique", but in this video the heat map shows that the features are highly correlated?

    • @adataodyssey
      @adataodyssey  1 year ago

      The problem with correlated features is that they can potentially lead to unexpected model predictions. That is when we sample pairs of feature values that do not exist in the dataset. Some models will still produce reasonable predictions even if there are correlated features.
      The point is you can still use SHAP even if you have correlated features. You just need to be aware that the results may be negatively impacted. It is important to validate the results using other methods and visualisations. For example, it's not included here, but in the course, we use SHAP interaction values to find an interaction between two features. We then confirm this interaction using a scatter plot. In other words, we had a useful result even with highly correlated features.
      I hope that makes sense!
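
The point about sampling "pairs of feature values that do not exist in the dataset" can be illustrated with a toy experiment. This is only a loose sketch of the idea, not how the shap package samples internally: two strongly correlated features are generated, one of them is shuffled independently of the other, and the script counts how many shuffled pairs fall outside anything seen in the original data. All names and numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two highly correlated features: x2 is x1 plus a little noise.
x1 = rng.normal(size=500)
x2 = x1 + 0.05 * rng.normal(size=500)

# In the real data the two features never differ by more than this.
observed_gap = np.abs(x1 - x2).max()

# Shuffling x2 independently of x1 breaks the relationship between them.
x2_shuffled = rng.permutation(x2)
unrealistic = np.abs(x1 - x2_shuffled) > observed_gap
fraction_unrealistic = unrealistic.mean()

print(f"correlation: {np.corrcoef(x1, x2)[0, 1]:.3f}")
print(f"fraction of shuffled pairs never seen in the data: {fraction_unrealistic:.2f}")
```

A model asked to predict on such pairs is extrapolating, which is why the resulting explanations should be validated with other plots, as described above.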

  • @bakerb-rz6lv
    @bakerb-rz6lv 1 year ago

    love you, bro.😀

  • @apogounte8239
    @apogounte8239 1 year ago

    Hi! Interesting video! Just wanted to mention that if you just run shap.plots.waterfall(shap_values[0]), you never get on the y-axis, the actual names of the features, but you get instead feature 5, feature 2, etc. Is there a quick fix?

    • @adataodyssey
      @adataodyssey  1 year ago

      Yes, you should be able to fix that. You can try:
      1) Make sure your X feature matrix (that you pass into the explainer function i.e. shap_values = explainer(X)) is a pandas dataframe and the column names are the correct feature names. You can check these using X.columns
      2) Update the shap_values after they have been created using something like:
      shap_values.feature_names = list(["feature 1","feature 2", ... ]). It is important to pass the new names as a list.
      Let me know if that helps
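
Option 2 above can be wrapped in a small helper that also checks that the number of names matches the number of features. This is a sketch only: `set_feature_names` is a hypothetical helper, and a `SimpleNamespace` stands in for the object returned by `explainer(X)` so that the snippet runs without the shap package installed. It assumes the real object exposes `.values` and a writable `.feature_names`, as described in the reply above.

```python
from types import SimpleNamespace


def set_feature_names(explanation, names):
    """Replace generic names such as 'Feature 5' with real column names."""
    names = list(names)  # the names should be passed as a list
    n_features = len(explanation.values[0])
    if len(names) != n_features:
        raise ValueError(f"expected {n_features} names, got {len(names)}")
    explanation.feature_names = names
    return explanation


# Stand-in for the object returned by explainer(X); the values are made up.
fake_shap_values = SimpleNamespace(
    values=[[0.1, -0.2, 0.3]],
    feature_names=["Feature 0", "Feature 1", "Feature 2"],
)

set_feature_names(fake_shap_values, ("length", "diameter", "height"))
print(fake_shap_values.feature_names)
```

With a pandas DataFrame, the names could be taken from `X.columns`, which keeps the labels in the same order as the columns the model was trained on.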

  • @Gustavo-nn7zc
    @Gustavo-nn7zc 7 months ago

    Hi @adataodyssey , great video, thanks! Is there a way to use SHAP for ARIMA/SARIMA?

    • @adataodyssey
      @adataodyssey  7 months ago

      Hi Gustavo, it's been a while since I've done time series analysis. If I remember correctly, those models are "intrinsically interpretable." This means you can look directly at the parameters in the model to understand how it works and don't need model-agnostic methods like SHAP.
      That being said, you can still apply SHAP to linear models (see the article below). So it may be useful for ARIMA but I haven't seen it applied before.
      medium.com/towards-data-science/8-plots-for-explaining-linear-regression-to-a-layman-489b753da696?sk=ae508ca38771f36045312a27b81ffa75

  • @mulusewwondieyaltaye4937
    @mulusewwondieyaltaye4937 9 months ago

    I can't access the SHAP Python course. Could you please give me access?

    • @adataodyssey
      @adataodyssey  9 months ago

      Hi Mulusew, the SHAP course is no longer free. But you will now get free access to my XAI course if you sign up to the newsletter

  • @soniaspisak645
    @soniaspisak645 8 months ago

    Hi, I'm struggling with explaining GRU and LSTM models with SHAP. Encouraged by your videos, I am considering buying the course, but does it cover working with 3D data? Is it even possible to implement SHAP and obtain reliable plots (without flattening the data) for time-series models?

    • @adataodyssey
      @adataodyssey  8 months ago

      Hi Sonia, unfortunately, the course focuses on tabular data and models like XGBoost, Random Forest and CatBoost. There is one lesson on SHAP for image data but it doesn't sound like that will help you much.
      If you are working with PyTorch, these articles might help you get started with applying SHAP:
      towardsdatascience.com/image-classification-with-pytorch-and-shap-can-you-trust-an-automated-car-4d8d12714eea?sk=b04dcbb8a09f049f605d2110b5c8d851
      towardsdatascience.com/using-shap-to-debug-a-pytorch-image-regression-model-4b562ddef30d?sk=7eb3016839186f1ba2a6f1f105f8ff64

  • @shamkhalmammadov4083
    @shamkhalmammadov4083 1 year ago +2

    Can you please make another example with categorical variables

    • @adataodyssey
      @adataodyssey  1 year ago +1

      Hi Shamkhal, there is a video in the course that explains categorical features :) Otherwise, you might find this article useful (no-paywall link): towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19

    • @shamkhalmammadov4083
      @shamkhalmammadov4083 1 year ago +1

      @@adataodyssey Thank you very much! I am a big fan. I loved the way you explained SHAP. I got Medium 3 days ago just to read your article. I still have a big problem with the waterfall plot: my target variable has 3 classes (0, 1, 2) and for some reason I cannot create a waterfall plot

    • @adataodyssey
      @adataodyssey  1 year ago

      @@shamkhalmammadov4083 Okay, in this case your target variable is categorical. I assumed you meant a categorical input feature. I have only worked with binary target variables.
      Can you send me a link to your dataset?

  • @markfedenia3383
    @markfedenia3383 1 year ago

    I see that cuML computes Shapley values, however it does not look like the Explainer object is compatible with shap. Do you know if there is any way to use the cuML Explainer object and model with the shap package (by the way, excellent videos)

    • @adataodyssey
      @adataodyssey  1 year ago

      Thanks! I'm not too familiar with cuML but I think it should be possible. You would have to replace all SHAP values and base_values in a SHAP explainer object with those from the cuML explainer object.
      It's not exactly what you are looking for but this article explains how you can manipulate the SHAP values object and then use the SHAP plots as normal: towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19

  • @felicebugge
    @felicebugge 9 months ago

    Really useful , thank you

    • @adataodyssey
      @adataodyssey  9 months ago

      No problem Felice!

  • @smartwork7098
    @smartwork7098 2 months ago

    Thank you so much!

  • @KOTESWARARAOMAKKENAPHD
    @KOTESWARARAOMAKKENAPHD 1 year ago

    I got an error in the boxplot code

    • @adataodyssey
      @adataodyssey  1 year ago

      Sorry to hear that. Can you describe the error in more detail?

  • @bakerb-rz6lv
    @bakerb-rz6lv 1 year ago +2

    I hit a strange bug. I copied your code and ran it. This morning the code worked correctly, but now it doesn't. I did not change anything!
    The error message, after I run the code "explainer = shap.Explainer(model)", is:
    TypeError: The passed model is not callable and cannot be analyzed directly with the given masker! Model: XGBRegressor(base_score=None, booster=None, callbacks=None,
    colsample_bylevel=None, colsample_bynode=None,
    colsample_bytree=None, early_stopping_rounds=None,
    enable_categorical=False, eval_metric=None, feature_types=None,
    gamma=None, gpu_id=None, grow_policy=None, importance_type=None,
    interaction_constraints=None, learning_rate=None, max_bin=None,
    max_cat_threshold=None, max_cat_to_onehot=None,
    max_delta_step=None, max_depth=None, max_leaves=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=None, ...)

    • @adataodyssey
      @adataodyssey  1 year ago

      Can you try to run this code:
      explainer = shap.Explainer(model,X[0:10])
      where X is the feature matrix used to train your model. For some models, you need to pass this in as a mask. You can see the full example for a random forest here:
      github.com/conorosully/SHAP-tutorial/blob/main/src/project_1_solution.ipynb

    • @bakerb-rz6lv
      @bakerb-rz6lv 1 year ago

      @@adataodyssey It still doesn't work. Strangely, it says "AttributeError: module 'numpy' has no attribute 'bool'". I do not understand why this code involves numpy. All the packages I use are the newest versions.

    • @bakerb-rz6lv
      @bakerb-rz6lv 1 year ago

      @@adataodyssey And I found another difference. In your GitHub code, the step 9--Train model. Your output is
      XGBRegressor(base_score=0.5, booster='gbtree', callbacks=None,
      colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,
      early_stopping_rounds=None, enable_categorical=False,
      eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',
      importance_type=None, interaction_constraints='',
      learning_rate=0.300000012, max_bin=256, max_cat_to_onehot=4,
      max_delta_step=0, max_depth=6, max_leaves=0, min_child_weight=1,
      missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=0,
      num_parallel_tree=1, predictor='auto', random_state=0, reg_alpha=0,
      reg_lambda=1, ...)
      But my output and your video's output is :
      XGBRegressor(base_score=None, booster=None, callbacks=None,
      colsample_bylevel=None, colsample_bynode=None,
      colsample_bytree=None, early_stopping_rounds=None,
      enable_categorical=False, eval_metric=None, feature_types=None,
      gamma=None, gpu_id=None, grow_policy=None, importance_type=None,
      interaction_constraints=None, learning_rate=None, max_bin=None,
      max_cat_threshold=None, max_cat_to_onehot=None,
      max_delta_step=None, max_depth=None, max_leaves=None,
      min_child_weight=None, missing=nan, monotone_constraints=None,
      n_estimators=100, n_jobs=None, num_parallel_tree=None,
      predictor=None, random_state=None, ...)

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@bakerb-rz6lv Sometimes, if you are using the newest versions, then other packages have not caught up yet. It could be that SHAP uses an older version of numpy. See this similar issue: stackoverflow.com/questions/74893742/how-to-solve-attributeerror-module-numpy-has-no-attribute-bool#:~:text=This%20means%20you%20are%20using,while%20that%20isn't%20fixed.
      The important point is: "Then, in version NumPy 1.24.0, the deprecated np.bool was entirely removed. This means you are using a NumPy version that removed the deprecated ways AND the library you are using wasn't updated to match that version (uses something like np.bool instead of just bool)."
      You could try to install an earlier version of NumPy. But this is just a guess on my part.

    • @bakerb-rz6lv
      @bakerb-rz6lv 1 year ago

      @@adataodyssey God damn it! You are right. I installed numpy==1.22.3 and it works correctly. Maybe you can pin this comment to the top to alert other beginners.
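
For anyone hitting the same error, a quick check of the installed NumPy version can save some guessing. The helper below is a sketch based on the quote above (the `np.bool` alias was removed in NumPy 1.24) plus the fact that, as far as I know, NumPy 2.0 brought `np.bool` back. The function name is hypothetical, and it only looks at the major and minor version numbers.

```python
def np_bool_alias_missing(version):
    """Return True if this NumPy version has no `np.bool` attribute.

    Assumes the alias was removed in 1.24 and is available again from 2.0.
    `version` is a string such as "1.22.3".
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 24) and major < 2


for version in ["1.22.3", "1.24.0", "1.26.4", "2.0.0"]:
    print(version, np_bool_alias_missing(version))

# To check your own environment:
#   import numpy as np
#   print(np_bool_alias_missing(np.__version__))
```

If the check returns True and an older library still calls `np.bool`, the options are to pin NumPy below 1.24, as done above with numpy==1.22.3, or to upgrade the library that uses the removed alias.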

  • @wangchris5468
    @wangchris5468 1 year ago

    Lovely ~~~~ 👍👍👍

  • @noazamstein5795
    @noazamstein5795 11 months ago

    What does it mean that being a male increases the prediction by 0.78, AND ALSO not being an infant FURTHER increases it by 0.42? These two are obviously mutually exclusive, so I would expect either one of them being the sum of 0.78+0.42 or something else

    • @adataodyssey
      @adataodyssey  11 months ago

      Your confusion is warranted as there is not a clear interpretation for this feature. In the model, there are three sex features (M, F and I). Together they are mutually exclusive. You are right, by summing up the values you get a clear interpretation of the contribution of the original categorical feature.
      Unfortunately, there is no easy way to do this with the SHAP package. We discuss this in my SHAP course. You can also find a solution in this article:
      towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19

  • @Irades
    @Irades 5 months ago

  • @bakerb-rz6lv
    @bakerb-rz6lv 1 year ago

    Hello, teacher. I used another method to train my model. Here is some of the code:
    import numpy as np
    import xgboost as xgb
    from sklearn.model_selection import train_test_split
    # Extract feature and target arrays
    X, y = df.drop('Grade', axis=1), df[['Grade']]
    # Extract text features
    cats = X.select_dtypes(exclude=np.number).columns.tolist()
    # Convert to Pandas category
    for col in cats:
        X[col] = X[col].astype('category')
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
    dtrain_reg = xgb.DMatrix(X_train, y_train, enable_categorical=True)
    dtest_reg = xgb.DMatrix(X_test, y_test, enable_categorical=True)

    • @bakerb-rz6lv
      @bakerb-rz6lv 1 year ago

      params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}
      n = 100
      model = xgb.train(
          params=params,
          dtrain=dtrain_reg,
          num_boost_round=n,
      )
      explainer = shap.Explainer(model)
      shap_values = explainer(X)

    • @bakerb-rz6lv
      @bakerb-rz6lv 1 year ago

      And something goes wrong:
      TypeError: The passed model is not callable and cannot be analyzed directly with the given masker! Model:
      How can I fix it?

    • @adataodyssey
      @adataodyssey  1 year ago

      Sorry I missed this comment! But I think I answered your question on the other comment :)