while I was looking for an example project on ordered logit model in R, I came across with this superb video. Thanks a lot, Bharatendra!
Thanks for comments!
Absolute genius. I would pay a million bucks to be your student.
This video really helps a lot for my project! Thank you!!!!!
Thanks for the feedback!
Thank you so much!! This video is extremely helpful and clear!!!
You're so welcome!
Really great tutorial. Thank you Sir.
Very helpful video. Thank you very much!
You're welcome!
Thank you so much
for this contribution...congratulations from Peru
Thanks for comments!
Big thanks for your video. It helps a lot
You are welcome!
Great video! Many thanks for sharing this wonderful material. I will subscribe to your channel.
Greetings from Uruguay, South America!
All the best,
jc
Thanks and welcome!
Excellent video. Thank you
You are welcome!
Excellent tutorial!
Thanks!
@bkrai Hello again Dr Rai, I tried to perform an OLR but the Brant test assumption did not hold: the omnibus test and some other variables had p-values below 0.05. What else should I do? Is there any alternative test for ordinal dependent variables? Your kind advice will be greatly appreciated.
thanks for the video! very helpful
Thanks for comments!
Perfect, thanks for sharing!
Thanks for comments!
Very good video!
Hello, thanks a lot for your video, it is very helpful. Could you please explain the meaning of the confusion matrix error? Also, how can we compute the R-square of our model?
For confusion matrix you may refer to:
th-cam.com/play/PL34t5iLfZddvv-L5iFFpd_P1jy_7ElWMG.html
Also note that when response is a factor variable, we do not use R-square.
Great Video Dr. Rai, Could you also help for Partial Proportional Odds Model
Thanks, I've added it to my list.
Thank you!
You are welcome!
Sir, This video was helpful. Can you make a video on Brant test for proportional odds assumption?
Will try
Superb!!!!
Thanks!
Hello Professor. Great lesson. Quick question. I was wondering if we could have used as.ordered(data$Tendency) instead of as.factor. Can you please shed some light on this? Thanks a lot in advance.
Hello and great video!
Would you suggest this model for modelling the results of a football game where the points earned in the end are 0,1 or 3?
Yes, it should work for such data.
Thanks for making this video, it's very helpful for us.
Please sir, can you explain how we get the alpha values for the categories? Is there any formula to calculate alpha (α)?
Sir, very nicely explained. I tried it with my data by following your video step by step, but I have one issue. My independent variables are also ordinal in nature, and I converted them to categorical. Is that correct? Which regression do you suggest for predicting an ordinal variable when the independent variables are also ordinal?
The method depends on the dependent variable and not much on the independent variable.
What should change in the input file if the independent variables have 3-4 ordinal levels? Should the independent variable be coded as 1, 2, 3, 4 and then converted to an ordered factor like you did for NSP?
You can use ordered() for an independent ordinal variable. Some researchers also recommend changing them to numeric variables, as it leads to a much simpler model.
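A small base-R sketch of the two options in the reply above. The variable name `edu_raw` and its values are made up for illustration. By default R applies polynomial contrasts to an ordered factor, so the design matrix gets linear, quadratic and cubic columns, while a numeric score contributes a single slope.

```r
# Hypothetical ordinal predictor with 4 levels (illustrative data only)
edu_raw <- c(1, 2, 3, 4, 2, 3, 1, 4)

# Option 1: ordered factor -> polynomial contrasts (.L, .Q, .C) by default
edu_ord <- factor(edu_raw, levels = 1:4, ordered = TRUE)
X_ord <- model.matrix(~ edu_ord)
print(colnames(X_ord))

# Option 2: numeric score -> a single column, i.e. one slope in the model
edu_num <- as.numeric(edu_raw)
X_num <- model.matrix(~ edu_num)
print(colnames(X_num))
```

With four levels the ordered factor costs three coefficients, which is the sense in which the numeric version is the simpler model.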
Good day professor, how can I use ordinal logistic regression with BMI?
See if this research paper helps:
www.researchgate.net/publication/260273192_Does_Consumer_Behaviour_on_Meat_Consumption_Increase_Obesity_-_Empirical_Evidence_from_European_Countries
The video is really helpful.
I am struggling to understand why the levels of the dependent variable are shown combined with |
Could anyone please explain?
TIA
1|2 is the intercept (cutpoint) that separates level 1 from level 2 and above, and 2|3 is the cutpoint that separates levels 1 and 2 from level 3 and above.
Dr. Bharatendra
Could you please explain how to interpret the levels of the dependent variable when they are combined with |
For example, here is the summary and p-values of my model. I am struggling to interpret the dependent variable outcome, TIA.
Coefficients:
       Value Std. Error t value
H    0.10955    0.06687  1.6381
AGR  0.05929    0.06825  0.8687
NP2 -1.00909    0.30407 -3.3186
NP3 -1.69956    0.40289 -4.2184
NP4 -0.28106    0.44589 -0.6303

Intercepts:
      Value Std. Error t value
1|2 -1.1571     0.6301 -1.8363
2|3 -0.0505     0.6090 -0.0829
3|4  0.9036     0.6022  1.5005
4|5  2.2627     0.7164  3.1584
5|6  5.1148     1.5859  3.2253
6|7 16.5213     9.1049  1.8145

Residual Deviance: 631.3888
AIC: 653.3888

          Value Std. Error    t value p-value
H    0.10954539 0.06687426  1.6380799  0.1014
AGR  0.05928751 0.06825109  0.8686676  0.3850
NP2 -1.00909459 0.30407139 -3.3186107  0.0009
NP3 -1.69956102 0.40288860 -4.2184390  0.0000
NP4 -0.28105858 0.44589078 -0.6303306  0.5285
1|2 -1.15712803 0.63014735 -1.8362817  0.0663
2|3 -0.05048673 0.60902379 -0.0828978  0.9339
3|4  0.90356996 0.60219631  1.5004575  0.1335
4|5  2.26273192 0.71641548  3.1584073  0.0016
5|6  5.11484231 1.58585762  3.2252847  0.0013
6|7 16.52126027 9.10488998  1.8145480  0.0696
Hi Sir... can you please explain the Ordered Probit model for the same data with a tendency with 3 levels as the dependent variable?
Dr, thanks a lot for your example. Could you help me with a question: what is the difference between clm and polr? I was trying to use polr with financial rates to estimate ratings, but when I use it I get this warning:
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
But if I use clm, that doesn't happen.
Could you help me understand these two functions?
Thanks a lot
Regards from Ecuador
Note that warning messages in R are ok. It's not an error.
Hello, thank you for this video, it's been super helpful!
I have a question regarding the dependent variables. How would you interpret the polr function output for dependent variables that are factors? For example, Tendency (levels: -1,0,1) was used as a dependent variable, how would you interpret each of the coefficients?
excellent
Thanks for the video. To calculate probabilities, why did you use alpha-b1x1+.... and not the conventional alpha+b1x1+...? It seems different software uses different forms of the equation (?). I believe it is the former in R, perhaps SPSS too.
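On the sign question above: the MASS documentation for polr states the model as logit P(Y <= j) = zeta_j - eta, which is where the minus sign comes from. Below is a rough sketch of turning that into category probabilities. The intercepts and coefficients are copied from the polr summary posted earlier in this thread; the predictor values in `x` are assumptions made up for illustration.

```r
# Intercepts (cutpoints) and slopes copied from the polr summary posted earlier in the thread
zeta <- c("1|2" = -1.1571, "2|3" = -0.0505, "3|4" = 0.9036,
          "4|5" = 2.2627, "5|6" = 5.1148, "6|7" = 16.5213)
beta <- c(H = 0.10955, AGR = 0.05929, NP2 = -1.00909, NP3 = -1.69956, NP4 = -0.28106)

# Hypothetical observation (assumed values; NP is at level 2, so NP2 = 1)
x <- c(H = 5, AGR = 3, NP2 = 1, NP3 = 0, NP4 = 0)

eta <- sum(beta * x)                 # linear predictor b1*x1 + b2*x2 + ...
cum_prob <- plogis(zeta - eta)       # P(Y <= j) = logistic(alpha_j - eta)
probs <- diff(c(0, cum_prob, 1))     # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
names(probs) <- 1:7
print(round(probs, 4))
print(sum(probs))
```

Because the category probabilities are differences of consecutive cumulative probabilities, they add up to one by construction.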
Is there any way to do an ordinal logistic regression for panel data?
Hello Sir ,
Great video.
I did not understand how you calculated the p-value from the t-stat using this formula:
pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2. Could you please explain each term you have used in this formula and why?
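A term-by-term sketch of that formula, using the t values from the coefficient table posted earlier in this thread. It treats the "t value" column as a large-sample z statistic, which is the normal approximation the formula relies on; the vector name `t_value` is just for illustration.

```r
# t values copied from the coefficient table posted earlier in this thread
t_value <- c(H = 1.6381, AGR = 0.8687, NP2 = -3.3186, NP3 = -4.2184, NP4 = -0.6303)

# abs(): the sign does not matter for a two-sided test
# lower.tail = FALSE: area to the right of |t| under the standard normal curve
# * 2: add the matching left tail to get the two-sided p-value
p_value <- pnorm(abs(t_value), lower.tail = FALSE) * 2
print(round(p_value, 4))
```

The rounded results should line up with the p-value column in that posted table.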
Hello Professor, how can I rank the significant variables from an ordinal logit model? I previously performed dominance analysis on the binary logit model but in case of an ordinal logit model that seems inappropriate.
One way could be to use p-value.
is there a way to add nested effects into the model???
Do you have any tutorial for goodness of fit test for ordinal logistic regression?
Is the goodness-of-fit test applied in Stata?
It already includes test of significance.
Hello Sir, thank you so much for this tutorial. I learned a lot. However, I encountered a problem. When I ran the summary command, I got the message "Error in svd(X) : infinite or missing values in 'x'". How can I fix this problem?
What are your thoughts on AIC?
It estimates model-related error. It is a lower-is-better metric that helps to assess model quality, and it is used for model selection or comparison.
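As a quick check of how the number is built, the AIC in the summary output posted earlier in this thread can be reproduced from its residual deviance plus twice the number of estimated parameters. This is only a sketch with those posted numbers; the variable names are illustrative.

```r
# Values taken from the summary output posted earlier in this thread
residual_deviance <- 631.3888
n_slopes <- 5        # H, AGR, NP2, NP3, NP4
n_intercepts <- 6    # 1|2 ... 6|7

# AIC = deviance + 2 * (number of estimated parameters)
k <- n_slopes + n_intercepts
aic <- residual_deviance + 2 * k
print(aic)
```

The penalty term 2 * k is why adding a variable only lowers AIC when it reduces the deviance by more than 2 per extra parameter.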
Indeed, it's a great video on ordinal logistic regression. Thanks professor. I am trying to create a model for my data set and I am facing an issue. When I ran the predict command on my training data set, I got very small probability values (the probabilities do not sum to one). What could be the reason?
Seeing this today. Probably resolved by now.
Hello, could you please tell me how did you get equations for probability, at 9:31/19:21 in above video
It is similar to steps shown in the link below at 4:13,
th-cam.com/video/fDjKa7yWk1U/w-d-xo.html
I have the same question. pnorm() gives the p-value for a z-statistic, while here it is a t-statistic.
Hi Bharatendra, my response variable is the score on a Likert scale from 0 (the worst condition) to 4 (the best). Should I use the function as.ordered? If yes, should I keep 4 as the best condition and zero as the worst? Thanks
Yes, that would work fine.
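A minimal sketch of coding such a response, with made-up scores. Spelling out `levels` is a choice made here so the order 0 < 1 < 2 < 3 < 4 is explicit even if some score is absent from the data.

```r
# Hypothetical Likert scores, 0 = worst condition, 4 = best
score <- c(0, 3, 4, 1, 2, 4, 0, 3)

# Spell out the levels so the order is explicit (0 < 1 < 2 < 3 < 4)
score_ord <- factor(score, levels = 0:4, ordered = TRUE)

print(is.ordered(score_ord))
print(levels(score_ord))
print(score_ord[2] < score_ord[3])   # comparisons respect the ordering
```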
how can i fit a model with ordinal response without proportional odds?
You can try this:
th-cam.com/video/dJclNIN-TPo/w-d-xo.html
thank you :)
Thanks for your comment!
How would you interpret the predicted probabilities from a reference category of a categorical predictor? In other words I’m trying to present the probabilities which I get in my model however I’m confronted with my reference category and hence what would be the best way to derive these? Thanks
Consider that my dependent variable is anaemia status, for a thesis on "mixed effect ordinal logistic regression".
1. How can I obtain a table of the percentage of anaemia status by region in R?
2. How can I obtain a table of the prevalence of anaemia status by predictors of anaemia among women of reproductive age in R?
3. How can I obtain a table of adjusted odds ratios (AOR) and their 95% CIs for a mixed effect ordinal logistic regression in R?
Very Nice explanation sir. Can you please upload the Cardiotocographic.csv file?
Here is the link: goo.gl/Xc4G7J
Is this approach equal to the CatReg function in SPSS with ranking?
I've not checked it in SPSS. But I guess results should be same.
Sir, can you show how to interpret the abalone data from Kaggle or UCI?
I saw this today, hope it's taken care of.
how to calculate bias and variance for ordinal.
Sir which model I should use if all the variables both dependent and independent are categorical. Please help me with this
Try Random Forest:
th-cam.com/video/dJclNIN-TPo/w-d-xo.html
@bkrai thanks again sir, grateful forever
You are welcome!
@bkrai sir, I am getting many errors. Can I have your email address, please?
seemabharat@gmail.com
What does it mean to be 'rank deficient'?
Which part of the video are you referring to?
I'm getting this kind of error. Do you know what it means?
Warning message:
In polr(AccessOnlineRecord ~ ., trainHint, Hess = TRUE) :
design appears to be rank-deficient, so dropping some coefs..........
It is just a warning message, not an error.
Hi sir. Can you please code support vector learning with ordinal regression?
Thanks for the suggestion, I'm adding it to my list for future.
Sir, are Ordinal Regression and Ordinal Logistic Regression one and the same, or are they different?
Ordinal logistic regression is one type of ordinal regression.
OK. What kind of ordinal regression would you suggest for a situation where I have 15 features (3 integer, 3 numeric, 8 binary categorical) and 1 count variable as the dependent variable? I tried ordinal logistic regression but did not get a good result. My count is zero-inflated and I tried a ZIP model too, which was not that great, and the cumulative link model (clm) is not fitting well either. Kindly suggest.
What is your response variable?
@bkrai It's a count, and I also tried ranking it. I have many zeros.
Thank you Bharatendra Rai. I get your explanation and have adapted my work well following the steps shown in your video.
I have one issue please. For independent categorical columns with 3 or more levels, like the "Tendency" column shown in your video, the model gives a different "Value", "Std. Error", "t value" and "p value" for each level of the variable.
This makes it challenging and confusing to interpret the model and write out its equation, since the p-values of some levels may not be significant and those levels would be removed, while the significant levels are kept.
How can such a model be clearly written out and explained?
Gracias!
When an independent variable is categorical and takes three values, the correct way to represent it in a regression-based model is with the help of 3-1=2 dummy variables. That's what you see here. When Tendency0 & Tendency1 are both zero, then Tendency = -1. When Tendency0 = 1 & Tendency1 = 0, then Tendency = 0. When Tendency0 = 0 & Tendency1 = 1, then Tendency = 1. Note that in the equation Tendency0 & Tendency1 can only be 0 or 1.
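The dummy coding described above can be inspected directly in base R. This is only an illustration with made-up Tendency values; `model.matrix` also adds an intercept column, which is not one of the dummies.

```r
# Tendency takes the values -1, 0, 1, as in the video (values here are made up)
Tendency <- factor(c(-1, 0, 1, 0, 1, -1))

# model.matrix shows the dummy coding R builds for a factor in a model formula
X <- model.matrix(~ Tendency)
print(X)
```

Rows where Tendency is -1 have zeros in both Tendency0 and Tendency1, which makes -1 the reference level.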
Bharatendra Rai, in my case I coded the three levels of my independent categorical data as 1, 2, 3. (Probably I should start with zero?)
I converted them to factors. With some independent variables that were continuous or categorical, and the dependent variable, I ran the model using polr.
The output always gave me one coefficient for each continuous independent variable, whereas the categorical ones had a different coefficient for each level. As with yours, Tendency 0 had a different coefficient and p-value from Tendency 1, and both were significant.
However, when I checked the significance of my variables from their p-values, I observed that the p-values of the levels differ within some variables (e.g. edu with levels 1, 2, 3: R chose level 1 as the reference level, and level 3 had a p-value greater than 0.05 while level 2 had a p-value less than 0.05). I suppose I should remove level 3 too, as I remove the non-significant variables from the equation.
How can I do so, and what would the interpretation be?
Thanks for your kind offer to help.
For categorical variables, if even one level is significant, do not drop the variable from the model.
Bharatendra Rai I sincerely appreciate your explanation. It is noted.
+Nasamu Bawa great 👍
Hello, thank you for your great video. I have a question. Is AIC important here? Isn't AIC here big for the model since it is larger than 1000 already?
Yes it is high. In the same example when we made a model with three variables, it was over 1700. By adding more variables it came down to about 1038, which is a significant improvement.
When you add more variables the AIC goes down, but then you select variables which have a significance level >0.1 and the AIC goes back up, doesn't it? Wouldn't you use the model with the lowest AIC, and if not, why use the AIC at all? Can I also compare models with the AIC when variables are log transformed in some models but not in others?
Does anyone know if there is a maximum number of independent factors R can handle for this model? I have 6 factors and it gives me an error. However, if I only use 5 of them, no matter which, R works perfectly normally.
It must be some other issue. In this example I've used 21 variables without any problem.
@bkrai Thanks! But the error I get is: attempt to find suitable starting values failed
In addition: Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
hi sir, how do you know the variable Max is causing the warning?
+lauualb it was based on trial and error.
what if the intercept is insignificant
That's ok, we should still keep it.
Hi Sir, it's a nice video. I always follow your other videos, they are very good.
I am running the ordinal LR on my own data i.e., insurance to find the EMlevel and this dependent variable contains 6 levels i.e., 1,2,3....6. So as per your instructions I converted EMlevel variable to ordered and str is appearing as "EMLevel : Ord.factor w/ 6 levels "1"
Hello sir,
I am getting the following error
"Error in optim(s0, fmin, gmin, method = "BFGS", ...) :
initial value in 'vmmin' is not finite
In addition: Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred"
can you explain this?
Send the code that you used so I can take a look.
mod
@Rahul Kadge could you find a solution for this error, I got the same and would like to know how you solved it. Thanks,