Great series of lectures! Would love to see more of your videos on TDA.
Thanks! Stay tuned; I intend to put out more talks.
Thank you so much for the great lecture series!
Greetings! How do you compute A^X[\pi^X] in practice? Finally: great sequence of lectures, really well done!
It's computed directly as a byproduct of the latent space fit. Check out our code for more details: github.com/BorgwardtLab/topological-autoencoders
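For readers who want a self-contained illustration: in dimension 0 of a Vietoris-Rips filtration, the persistence pairing pi^X consists exactly of the edges of a minimum spanning tree of the distance matrix, so the selected distances A^X[pi^X] can be sketched with SciPy alone. This is a toy reconstruction under that assumption, not the repository's actual code.

```python
# Hedged sketch: for 0-dimensional persistent homology of a Vietoris-Rips
# filtration, the pairing pi^X is the set of minimum-spanning-tree edges,
# so A^X[pi^X] is the vector of those edge lengths. Toy data, not the
# implementation from the linked repository.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

X = np.random.default_rng(0).normal(size=(50, 3))  # toy point cloud

A_X = squareform(pdist(X))                # pairwise distance matrix A^X
mst = minimum_spanning_tree(A_X).tocoo()  # edges pairing 0-dim features

# pi^X as index pairs, and the selected distances A^X[pi^X]
pairing = np.stack([mst.row, mst.col], axis=1)
selected_distances = A_X[mst.row, mst.col]
```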
Surely one of the best and most detailed tutorials available on YouTube on the topic. One thing that's been bothering me since the last lecture: say we want to feed these topological features into a machine learning algorithm. We would need a feature vector for every data point, whereas features like the persistence diagram or the Betti curve are defined on the complete set of data points. If we want to pass the Betti curve (or any other topological descriptor) to an SVM or some other machine learning algorithm, we will have only one of them for the whole data set, so how can we learn from a single data point? I am not sure where I am missing the link.
With TDA, you are typically making claims about the full data set. So, for example, every graph in your data set is assigned one Betti curve, and you make predictions about the whole collection of graphs.
@@Pseudomanifold Just to clarify: so in the current scenario, we can't perform tasks like node classification or link prediction when we talk about graphs. In a more general setting, we can perform such tasks if we have a set of point clouds where each point cloud represents one element.
@@PintuKumar-qd9ym Yes, that's right.
Node classification requires a different setup. It can be done but it's slightly more involved, so I did not cover it here yet.
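To make the setup from this exchange concrete, here is a minimal sketch assuming the giotto-tda library: each object in a collection gets its own Betti curve, and the flattened curves form an ordinary feature matrix for an SVM. Each object here is a point cloud rather than a graph, and the data and labels are synthetic, but the pipeline is the same.

```python
# A minimal sketch of the "one descriptor per object" setup, assuming
# giotto-tda; the point clouds and labels are made up for illustration.
import numpy as np
from gtda.homology import VietorisRipsPersistence
from gtda.diagrams import BettiCurve
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# 20 point clouds of 100 points in 2D: shape (n_samples, n_points, n_dims)
point_clouds = rng.normal(size=(20, 100, 2))
labels = rng.integers(0, 2, size=20)  # hypothetical label per cloud

# One persistence diagram per point cloud ...
diagrams = VietorisRipsPersistence(homology_dimensions=[0, 1]).fit_transform(point_clouds)
# ... and one Betti curve per diagram: (n_samples, n_homology_dims, n_bins)
curves = BettiCurve(n_bins=50).fit_transform(diagrams)

# Flattening each curve gives a fixed-length feature vector per object,
# i.e. an ordinary (n_samples, n_features) matrix for any classifier.
X = curves.reshape(len(curves), -1)
SVC().fit(X, labels)
```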
I ran into this exact problem while trying to use TDA with a toy dataset; here is one solution that might be worthwhile:
Say we have some features X and a target y. For some feature F with n classes and a sufficient number of points in each class, one could group X by F and then calculate topological features from each group. Then map those n topological features back onto X according to each row's class label of F (see the sketch after this comment). This might be useful if you suspect your feature F to be strongly correlated with y. If your feature F is continuous, you could even bin your data to obtain n classes with a sufficient number of points in each class.
Where I'm stuck now is what constitutes "a sufficient number of points in each class"? I suppose you could add the number of bins n to your hyperparameter tuning, but given the computational costs, it's not an ideal solution.
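A minimal sketch of this group-by-F idea, assuming pandas and giotto-tda; the names X, F, and y follow the comment above, everything else (the synthetic data, the entropy summary, the column names) is an illustrative choice.

```python
# Sketch of the group-by-F approach: one topological summary per class of F,
# broadcast back to every row. Synthetic data; giotto-tda assumed available.
import numpy as np
import pandas as pd
from gtda.homology import VietorisRipsPersistence
from gtda.diagrams import PersistenceEntropy

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(300, 4)), columns=["x1", "x2", "x3", "x4"])
df["F"] = rng.integers(0, 3, size=300)  # grouping feature with n = 3 classes

vr = VietorisRipsPersistence(homology_dimensions=[0, 1])
pe = PersistenceEntropy()

# One topological summary per class of F ...
group_features = {}
for label, group in df.groupby("F"):
    cloud = group[["x1", "x2", "x3", "x4"]].to_numpy()[None, :, :]  # (1, n_points, d)
    group_features[label] = pe.fit_transform(vr.fit_transform(cloud))[0]

# ... broadcast back to every row via its class label, as new columns of X.
topo = np.stack([group_features[label] for label in df["F"]])
df[["topo_H0", "topo_H1"]] = topo
```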
@@dyllanusher1379 you might want to look into Mapper, then. This is an algorithm for analysing data sets with continuous features.
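For a quick start with Mapper, here is a hedged sketch using the third-party kepler-mapper package (one of several Mapper implementations, not something from the lectures); the lens, cover, and clustering choices are arbitrary placeholders.

```python
# Sketch of Mapper on a data set with a continuous feature, using
# kepler-mapper. The continuous feature serves as the lens function;
# all parameter values here are illustrative, not recommendations.
import numpy as np
import kmapper as km
from sklearn.cluster import DBSCAN

X = np.random.default_rng(0).normal(size=(500, 3))  # toy data

mapper = km.KeplerMapper(verbose=0)
lens = mapper.fit_transform(X, projection=[0])  # column 0 as the lens

# Cover the lens with overlapping intervals, cluster each preimage, and
# connect clusters that share points: the result is the Mapper graph.
graph = mapper.map(
    lens,
    X,
    cover=km.Cover(n_cubes=10, perc_overlap=0.3),
    clusterer=DBSCAN(eps=0.5, min_samples=5),
)
mapper.visualize(graph, path_html="mapper_output.html")
```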