One thing I'm wondering... Say we were looking at one continuous factor x and a response y, where the true relationship is y = x^2 (unknown to us, of course). Say, for whatever reason, we decide to take the levels of x to be {-2, 2} and measure the response. We observe that the response y is the same in both cases, so effect_x = 0, and we would incorrectly conclude that x is insignificant. How do we avoid this mistake?
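To make the aliasing concrete, here is a minimal sketch (the noise-free setup is my own, purely for illustration):

```python
import numpy as np

# Two-level design: x at -2 and +2, two replicates each.
# True (but unknown) relationship: y = x**2, no noise for clarity.
x = np.array([-2.0, -2.0, 2.0, 2.0])
y = x**2

# Estimated main effect of x: mean at the high level minus mean at the low level
effect_x = y[x == 2].mean() - y[x == -2].mean()
print(effect_x)  # 0.0 -- the quadratic term is invisible to a two-level design

# A center point at x = 0 exposes the curvature: compare the average of the
# factorial points to the response at the center
y_center = 0.0**2
curvature = y.mean() - y_center
print(curvature)  # 4.0 -- nonzero, so a purely linear model is inadequate
```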
Let me give a concrete example: in the "comfort ~ humidity, temperature" example you keep using, when you run ANOVA, you get p-values of something like 0.1 and 0.2 for the factors. Neither is significant, so if we were looking only at these p-values, we'd throw out the terms. However, later you run a few more experiments at the star points for these factors and fit a response surface. We see clearly then that comfort is a quadratic function of the factors and that they are important. How would we avoid throwing out possibly important factors too early?
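For instance, here's roughly what I mean by fitting the second-order surface. The numbers below are made up (I don't have the data from the video), but the mechanics are the same:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Coded factorial points plus star (axial) points and a center point,
# i.e., a small central composite design in two factors
a = np.sqrt(2)
temp = np.array([-1, -1, 1, 1, -a, a, 0, 0, 0])
hum  = np.array([-1, 1, -1, 1, 0, 0, -a, a, 0])

# Hypothetical true surface: comfort peaks at the center, quadratic in both factors
rng = np.random.default_rng(0)
comfort = 80 - 5 * temp**2 - 4 * hum**2 + rng.normal(0, 1, temp.size)

df = pd.DataFrame({"temp": temp, "hum": hum, "comfort": comfort})

# Full second-order model: linear, interaction, and pure quadratic terms
model = smf.ols("comfort ~ temp + hum + temp:hum + I(temp**2) + I(hum**2)", data=df).fit()
print(model.summary())  # in this synthetic example, the quadratic terms come out significant
```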
Most experiments DO NOT start in a vacuum. There are usually some clues/expectations about the output. Model fitting is sometimes more of an art than a science. However, justifying a hypothesis without some research into, or understanding of, the variables can be very costly. If I had to divide the total time between forming a reasonable hypothesis and the rest of the work (e.g., collecting data, running the analysis, and interpreting it), I would allot more to forming the hypothesis than to the rest of the experiment. Even then, we can't guarantee we would avoid the situation you mentioned. Great question. Thanks!
@TheOpenEducator Gotcha. Thanks!