Hey Lukas,
Those are GREAT questions! First off, this video was made for an introductory statistics class that doesn't dive too deep into this stuff, so it was meant mainly as an informational video to let the students know these types of analyses exist; it wasn't intended to really teach the concepts that much. From the level of your questions, I would guess that you are much farther along! As to the dummy variable point: yes, it would, and I do have a dummy variable for gender (it is called "sex"). At 11:21 I show the table of values, and you will see the gender of the child. At 11:36 you will see the multiple regression equation "height of child = 25.6 + 0.377(height of mother) + 0.195(height of father) + 4.15(SEX)", where the SEX variable is the gender of the child.
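To make the role of the dummy variable concrete, here is a minimal sketch that just evaluates the equation quoted at 11:36. The 0/1 coding of SEX, which category gets the 1, the units, and the parent heights used below are all assumptions for illustration; the comment does not state them.

```python
def predict_child_height(mother_height, father_height, sex):
    """Evaluate the equation quoted at 11:36.

    `sex` is the dummy variable (assumed to be coded 0 or 1; which
    category is coded 1 is not stated in the thread).
    """
    return 25.6 + 0.377 * mother_height + 0.195 * father_height + 4.15 * sex


# Hypothetical parent heights, chosen only to show the mechanics.
mother_height, father_height = 64.0, 70.0

height_sex0 = predict_child_height(mother_height, father_height, 0)
height_sex1 = predict_child_height(mother_height, father_height, 1)

# The dummy shifts the prediction by its coefficient and leaves
# the slopes for the parents' heights unchanged.
gap = height_sex1 - height_sex0
print(round(height_sex0, 3), round(height_sex1, 3), round(gap, 3))
```

Whatever parent heights are plugged in, the two predictions differ by the SEX coefficient, which is all a dummy variable does in a model without interaction terms.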
As for using a quadratic to predict growth: yes, growth is normally THOUGHT of as being exponential, but the data set we had presented a shape that was more closely MODELED by a quadratic. Go to 13:55 and you will see in the table that the exponential model had a slightly smaller correlation coefficient. This reminds us that prediction really only works well within a small window. Our data ran only from 1800 to 2000, with only 11 data points! If we had more data points, it is highly likely that an exponential would end up fitting better.
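The "small window" point can be illustrated without any real data. The sketch below compares an exponential curve with a quadratic that matches it near the start (its second-order Taylor polynomial, not a least-squares fit, so it is only a stand-in for the fitted quadratic in the video). The growth rate is a made-up value.

```python
import math


def exponential(t, rate):
    """Exponential growth, scaled so the value at t = 0 is 1."""
    return math.exp(rate * t)


def quadratic_approx(t, rate):
    """Second-order Taylor polynomial of exp(rate * t) around t = 0."""
    x = rate * t
    return 1.0 + x + 0.5 * x * x


def relative_gap(t, rate):
    """How far the quadratic is from the exponential, relative to the exponential."""
    e = exponential(t, rate)
    return abs(e - quadratic_approx(t, rate)) / e


rate = 0.01  # hypothetical growth rate per year, for illustration only
for years in (50, 200, 400):
    print(years, round(relative_gap(years, rate), 3))
```

Close to the starting point the two curves are nearly indistinguishable, and the gap grows quickly once you move well outside that range, which is why a quadratic can look as good as (or better than) an exponential on a short series and still extrapolate badly.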
+DrCraigMcBridePhD
Thanks for the concise video. I had a question about your comment regarding the influence that the order of inputting the predictor variables has on their importance in the model. In the equation Daughter = 7.5 + .707 Mother + .164 Father, I gathered that if the father variable was input first, it may have taken on a value closer to .707 and the mother might have taken a value closer to .164. How do you account for this potential bias in the process, and how often do you encounter this problem? Is it simply a matter of putting in the variables that logically seem to have more influence before the others?
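One way to check this yourself is sketched below, under the assumption that both predictors are entered together in an ordinary least-squares fit. In that setting the fitted coefficients do not depend on the order of the columns; order matters in sequential or stepwise procedures, where it affects how much explained variation is credited to each variable or which variables get kept. The heights are made-up numbers, and `solve` and `ols` are small helpers written for this example, not functions from any package.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(XtX, Xty)


# Hypothetical heights, made up for illustration only.
mother = [62.0, 64.5, 66.0, 63.0, 67.5, 61.0, 65.0, 68.0]
father = [70.0, 68.5, 72.0, 69.0, 71.5, 67.0, 73.0, 70.5]
daughter = [63.0, 64.0, 66.5, 63.5, 67.0, 61.5, 66.0, 67.5]

# Fit once with mother first, once with father first.
b0_mf, b_mother_mf, b_father_mf = ols([mother, father], daughter)
b0_fm, b_father_fm, b_mother_fm = ols([father, mother], daughter)

print("mother:", b_mother_mf, b_mother_fm)
print("father:", b_father_mf, b_father_fm)
```

Each predictor keeps the same coefficient (up to rounding) in both fits, so in a simultaneous fit a .707 would not migrate to whichever variable happens to be listed first.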
Hi, it seems best. How can the critical period of weed competition in crops be determined using nonlinear regression equations (Gompertz and logistic) under different periods of weed competition and weed-free periods in SAS? Please also share the syntax used.
Dear Dr Craig Mc, first of all I would like to say thanks for your video. If possible, I have one question: how can I analyze my data set with a nonlinear least squares model? Let me describe my data set. My dependent variable is return on assets (ROA), which includes negative numbers. My independent variables are degree of operating leverage, degree of financial leverage, and total leverage. There is a normality problem in my data, and my analysis is a panel data analysis. So please help me with my thesis work.
Thanks!
I have a question: what analysis should I perform if I want to check whether a dependent variable (Y) can be described by several independent variables (in my case three: a, b and c) using a model with a quadratic equation? Example: Y = 5 + 2a - 3b + 6c + 2a*b - 8a^2
Gina Miranda, if you have actual data, you can simply compare the observed Ys to the predicted Ys and see how good a model the equation is. If you want to do it in a more "official" way, you will need to compute the multiple regression R, R-squared, etc.
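As a rough sketch of that comparison, the snippet below plugs each (a, b, c) into the example equation from the question and computes R-squared from the observed and predicted Ys. The data rows are invented for illustration, and `predict` and `r_squared` are names chosen for this example.

```python
def predict(a, b, c):
    """Evaluate the example equation from the question."""
    return 5 + 2 * a - 3 * b + 6 * c + 2 * a * b - 8 * a ** 2


def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical rows of (a, b, c, observed Y); not real measurements.
rows = [
    (0.0, 1.0, 2.0, 14.5),
    (1.0, 0.0, 1.0, 4.6),
    (1.0, 2.0, 0.0, -2.8),
    (2.0, 1.0, 1.0, -16.1),
    (0.5, 0.5, 0.5, 6.4),
    (1.5, 1.0, 2.0, 1.7),
]

observed = [y for _, _, _, y in rows]
predicted = [predict(a, b, c) for a, b, c, _ in rows]

r2 = r_squared(observed, predicted)
print(round(r2, 4))
```

Note that when the equation was not fitted to the same data, this R-squared can come out negative, which simply means the equation predicts worse than using the mean of Y. If you instead estimate the coefficients from your data, a*b and a^2 can be added as extra columns in an ordinary multiple regression, since the model is still linear in its coefficients.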
Hey, I have a question: what if I have no idea what function describes my set of data? Isn't there a way of figuring it out? For example, I have two independent variables that influence a dependent one. How can I come up with an equation that best describes it?
You really have to rely on technology to do it for you. If you are familiar with SAS or R, you can use them to compute the best model possible. Unfortunately, with multiple regression, you can no longer rely on a simple scatterplot to show you the relationship. Some programs can give you 3-D plots for multiple variables, but they can be tricky to understand.
So how do you build a multiple nonlinear regression equation (with different variables like x, y, z, d, r, ..., m, not x, x^2, x^3, x^n)?
Melentiev Ruslan, the variable letters don't matter. All that matters is the level of measurement. If the variables are all ratio level, nothing changes. If they are ordinal or other levels, you may need to run logistic regression, log regression, etc.
How to do the nonlinear regression of this database? I tried to use the logistic growth function and I couldn't.
The logistic growth function is Theta1 + (Theta2 - Theta1) / (1 + exp((X - Theta3) / Theta4)).
The database follows below:
Year Production
1971,00 40307,355
1972,00 41224,341
1973,00 40772,722
1974,00 40354,975
1975,00 42628,578
1976,00 47343,76
1977,00 52560,846
1978,00 57252,182
1979,00 62749,646
1980,00 65117,867
1981,00 62518,976
1982,00 58409,506
1983,00 57868,732
1984,00 57234,72
1985,00 56936,361
1986,00 57738,2
1987,00 57346,604
1988,00 55884,737
1989,00 51317,959
1990,00 43725,048
1991,00 38653,177
1992,00 34887,414
1993,00 32659,597
1994,00 32640,717
1995,00 34849,654
1996,00 36473,318
1997,00 35980,925
1998,00 33753,108
1999,00 34206,223
2000,00 36641,719
2001,00 37057,075
2002,00 36471,8
2003,00 38382,812
2004,00 41422,114
2005,00 41250,078
2006,00 41230,963
2007,00 43446,29
2008,00 46860,677
2009,00 52334,368
2010,00 54457,968
2011,00 58310,146
2012,00 63291
2013,00 63082
2014,00 68974
2015,00 69966
2016,00 66087
2017,00 71113
Note: I tried to reproduce the results obtained from the link below in the Minitab software and I was not able to.
www.roperld.com/science/uranium.htm
Yes, but the computations are horrendous! It is best to let some form of technology do it for you. If you run the data in SAS or R, it will produce the equation for you.
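Before handing the production data above to SAS, R, or Minitab, it can help to write the logistic growth function out and try a set of starting values by hand. The sketch below does only that: it parses rows that use decimal commas, evaluates the function as written in the question, and reports the sum of squared errors for one guess. The starting values are hypothetical, the four rows are copied from the table above, and the helper names are made up for this example; it does not perform the fit itself.

```python
import math


def logistic_growth(x, theta1, theta2, theta3, theta4):
    """Theta1 + (Theta2 - Theta1) / (1 + exp((x - Theta3) / Theta4))."""
    z = (x - theta3) / theta4
    if z > 700:
        # math.exp would overflow; the fraction is effectively 0 here.
        return theta1
    return theta1 + (theta2 - theta1) / (1.0 + math.exp(z))


def parse_row(line):
    """Turn a 'Year Production' row with decimal commas into two floats."""
    year_text, production_text = line.split()
    return (float(year_text.replace(",", ".")),
            float(production_text.replace(",", ".")))


def sum_squared_error(rows, params):
    """Sum of squared differences between the data and the curve."""
    return sum((y - logistic_growth(x, *params)) ** 2 for x, y in rows)


# Four rows copied from the table in the question.
sample = ["1971,00 40307,355", "1980,00 65117,867",
          "1994,00 32640,717", "2017,00 71113"]
rows = [parse_row(line) for line in sample]

# Hypothetical starting values (Theta1, Theta2, Theta3, Theta4).
guess = (30000.0, 70000.0, 1995.0, -5.0)
print(sum_squared_error(rows, guess))
```

Two things are worth checking when a fit like this will not converge. With X in the thousands, a poor guess for Theta3 or a very small Theta4 makes the exponential explode, so starting values matter a lot. Also, the production series rises, falls, and rises again, so a single logistic curve, which only moves in one direction, may simply not describe it well.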
How did you enter the data into Minitab to get the linear regression equation?
I didn't. It was a print-out from a textbook. I use SAS or R, but I am sure there are Minitab tutorials online you can find.
Are multiple nonlinear regression and nonlinear regression the same?
No. You can have a nonlinear regression line that is fitted to a single predictor variable. You can also have multiple regression lines of best fit that are linear! The shape and the number of predictor variables are independent of each other.
@craigmcbride2135 thank you