Is this a valid intuition? When the number of parameters is close to the dataset size (d), maybe each parameter ends up predicting a single data point, i.e. every parameter fits a different data point, which causes huge overfitting. Later, when we operate in the over-parameterised regime, the parameters just generalise.
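One way to probe this intuition is a small simulation. The sketch below is not from the lecture: it assumes a toy linear model with Gaussian features, fits minimum-norm least squares on the first `n_params` features, and averages test error over random trials. The function name `avg_test_error` and all sizes (40 training points, 200 features, noise level 0.5) are made up for illustration. In this setup the test error should peak when the number of parameters equals the number of training points and drop again beyond that.

```python
import numpy as np

def avg_test_error(n_params, n_train=40, n_features=200, noise=0.5,
                   n_test=500, trials=50, seed=0):
    """Average test MSE of min-norm least squares using the first n_params features."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(trials):
        # Hypothetical ground truth: every one of the n_features carries some signal.
        beta = rng.normal(size=n_features) / np.sqrt(n_features)
        X_tr = rng.normal(size=(n_train, n_features))
        X_te = rng.normal(size=(n_test, n_features))
        y_tr = X_tr @ beta + noise * rng.normal(size=n_train)
        y_te = X_te @ beta + noise * rng.normal(size=n_test)
        # lstsq returns the minimum-norm solution when n_params > n_train.
        w, *_ = np.linalg.lstsq(X_tr[:, :n_params], y_tr, rcond=None)
        errs.append(np.mean((X_te[:, :n_params] @ w - y_te) ** 2))
    return float(np.mean(errs))

# Sweep the parameter count across the interpolation threshold (n_params == n_train).
errors = {p: avg_test_error(p) for p in (10, 20, 40, 80, 200)}
for p, e in errors.items():
    print(f"n_params={p:3d}  avg test MSE={e:.3f}")
```

At the threshold there is essentially only one way to fit all training points exactly, so the fit also absorbs the noise. With more parameters there are many interpolating solutions, and the minimum-norm one tends to be better behaved, at least in this toy setting.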
May I know where I can get the slides from the lecture?
Hi there, try the syllabus and course materials on the course website: cs229.stanford.edu/syllabus-spring2022.html