GNN Project #4.2 - GVAE Training and Adjacency reconstruction

  • Published on Dec 10, 2024

Comments • 24

  • @ILoveMattBellamy 1 year ago

    Approximately halfway through this series, but I had to take a pause and congratulate you on the amazing content! The visuals and your commentary throughout are exceptional. Thank you for sharing your work! Please keep it up!

  • @DhananjaySarkar-y3i 2 months ago

    One of the best videos I have ever seen.

  • @zijiali8349 1 year ago

    At 10:42, could you explain why you multiplied latent_embedding_size by 2 as the input dim for the decoder layers?

  • @danieladebi7665 2 years ago +1

    @12:17 Shouldn't we be taking the square root of the std variable to get the actual standard deviation, assuming we're getting the log of the variance then exponentiating that? Or does this not actually matter here? Also great series!

    • @DeepFindr 2 years ago +1

      Hi!
      This is actually not required, because
      log(sigma ** 2) = 2 * log(sigma)
      Because 2 is just a constant factor, the network can learn to output log(sigma) directly, so we can simply treat the exponentiated value as the standard deviation. :) Hope this makes sense.
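The identity in this reply can be checked numerically. The sketch below is plain Python so it runs without PyTorch, and the helper names (`sigma_from_log_var`, `sigma_from_log_std`, `reparameterize`) are hypothetical, not taken from the video's code. It illustrates that reading a layer's output as log(sigma ** 2) or as log(sigma) is a matter of convention: both recover the same sigma once the factor 2 is accounted for.

```python
import math
import random

def sigma_from_log_var(log_var):
    # Convention A: the layer output is log(sigma ** 2), so halve it before exp.
    return math.exp(0.5 * log_var)

def sigma_from_log_std(log_std):
    # Convention B: the layer output is log(sigma) itself.
    return math.exp(log_std)

def reparameterize(mu, sigma, rng):
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1).
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps

sigma = 0.3                      # made-up standard deviation
log_var = math.log(sigma ** 2)
log_std = math.log(sigma)

# log(sigma ** 2) = 2 * log(sigma)
identity_holds = math.isclose(log_var, 2 * log_std)

# Both conventions recover the same standard deviation.
same_sigma = math.isclose(sigma_from_log_var(log_var), sigma_from_log_std(log_std))

# Draw one latent sample using convention B.
z = reparameterize(mu=1.0, sigma=sigma_from_log_std(log_std), rng=random.Random(0))
```

Whichever convention is chosen, the KL term of the loss should interpret the same layer output in the same way.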

    • @danieladebi7665 2 years ago

      @DeepFindr OK, this makes sense, thank you!

  • @변진영-v7j 2 years ago

    The best video I've ever seen!

  • @Ee-ki7cn 1 year ago +1

    If I use the InnerProduct as the decoder, must the shape of the adjacency matrix be identical for every input graph? I.e., I have a training set of graphs with different nodes and a varying number of nodes. Can I just take the graphs as they are and use the InnerProduct?

    • @Ee-ki7cn 1 year ago

      And alternatively, how do I handle batching in the training function without using batchnorm in the model?
      Thank you

  • @kianpu6242 3 years ago

    @16:32 What is the reason for having three repeated decoder layers? I noticed you didn't pass x into the second dense layer after the first dense layer. Instead, you passed in inputs.

    • @DeepFindr 3 years ago +1

      Hi! Yes, this error is already fixed in the code, but it had no impact on the result.
      There is no specific reason for using three layers. It's just to give the model more depth and parameters.
      Cheers
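The error discussed in this thread can be sketched with toy layers. Everything here is hypothetical (the `make_layer` helper, the layer names and the numbers are not from the video's code); each "layer" is a simple elementwise scale, shift and ReLU. In the buggy version the second layer is fed `inputs` again, so the first layer's output is discarded. In this toy example the two outputs differ; the reply above reports that in the actual model the results were unaffected.

```python
def make_layer(scale, shift):
    # Toy stand-in for "dense layer + ReLU", applied elementwise to a list of floats.
    def layer(values):
        return [max(0.0, scale * v + shift) for v in values]
    return layer

layer_1 = make_layer(2.0, 1.0)
layer_2 = make_layer(0.5, -1.0)
layer_3 = make_layer(3.0, 0.0)

def decode_buggy(inputs):
    x = layer_1(inputs)
    x = layer_2(inputs)  # bug: uses inputs instead of x, so layer_1 is ignored
    return layer_3(x)

def decode_fixed(inputs):
    x = layer_1(inputs)
    x = layer_2(x)       # fix: chain the layers
    return layer_3(x)

buggy_out = decode_buggy([1.0, 4.0])
fixed_out = decode_fixed([1.0, 4.0])
```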

  • @nicolasf1219 5 months ago

    Would this also work on large graphs?

  • @artistworking7755 2 years ago

    Hi! Thanks for sharing. May I ask why there are no activation layers in the decoder functions (i.e., self.edge_decode(z))? I have also seen that in the original GVAE the decoding is done by sigmoid(z@z.t()), where z is the latent representation of the graph. I am trying to understand these reconstruction losses and to figure out which one is best for reconstructing a non-binary adjacency matrix.

    • @DeepFindr 2 years ago

      Hi! The final projections have no activation because they should output plain logit values that are then fed into the argmax. It might make sense to also put a softmax in between to sharpen the distribution.
      The shared layers before them apply ReLU and provide the nonlinearity.
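One detail worth checking when placing a softmax between the logits and the argmax: softmax is strictly increasing in each logit, so the predicted class does not change. A minimal plain-Python sketch with made-up logit values (the helper names are hypothetical):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    # Index of the largest value.
    return max(range(len(values)), key=lambda i: values[i])

logits = [0.2, 2.5, -1.0, 1.7]   # hypothetical logits for 4 classes
probs = softmax(logits)

pred_from_logits = argmax(logits)
pred_from_probs = argmax(probs)
```

The softmax matters for a loss computed on probabilities, but the argmax prediction itself is the same either way.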

    • @DeepFindr 2 years ago

      The other way to decode is called "dot product decoder" and is just another way. This approach, however, creates similar embeddings for nodes that should be connected. This puts more emphasis on similar neighborhoods, which doesn't necessarily need to hold for connections.
      I think the projection head adds more flexibility, but I have not empirically evaluated this.
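The dot product decoder mentioned here, sigmoid(z @ z.t()), can be sketched in plain Python. The embedding values and the `dot_product_decode` helper are made up for illustration. The output is an n x n matrix for a graph with n nodes, so graphs of different sizes need no fixed adjacency shape, and nodes with similar embeddings receive a higher edge probability.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot_product_decode(z):
    # z: list of n node embeddings, each a list of d floats.
    # Returns an n x n matrix with entry (i, j) = sigmoid(<z_i, z_j>).
    return [[sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in z] for zi in z]

# Two hypothetical graphs with different node counts but the same embedding size.
z_small = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
z_large = [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0], [-1.0, 0.0]]

adj_small = dot_product_decode(z_small)   # 3 x 3
adj_large = dot_product_decode(z_large)   # 4 x 4
```

Because the dot product is symmetric, this decoder always produces a symmetric matrix, which fits undirected graphs.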

  • @liongkhaijiet5014 3 years ago

    The concepts and reasons are well explained. But I have one question: do we just take the whole matrix instead of the upper triangular matrix if our graph is directed?

    • @DeepFindr 3 years ago +1

      Yes. Also you can ignore the diagonal if you don't have self loops.
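The choice between the upper triangle and the whole matrix can be sketched as a selection of node pairs. This is a plain-Python illustration; the `target_pairs` function and its parameters are hypothetical, not part of the video's code.

```python
def target_pairs(num_nodes, directed=False, self_loops=False):
    # Node pairs (i, j) whose adjacency entries are used as reconstruction targets.
    pairs = []
    for i in range(num_nodes):
        for j in range(num_nodes):
            if i == j and not self_loops:
                continue  # skip the diagonal when there are no self-loops
            if not directed and j < i:
                continue  # undirected: (j, i) duplicates (i, j), keep the upper triangle
            pairs.append((i, j))
    return pairs

undirected = target_pairs(4)                               # upper triangle, no diagonal
directed = target_pairs(4, directed=True)                  # whole matrix, no diagonal
undirected_with_loops = target_pairs(4, self_loops=True)   # upper triangle plus diagonal
```

For n nodes without self-loops this gives n * (n - 1) / 2 pairs in the undirected case and n * (n - 1) in the directed case.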

  •  3 years ago

    It's a great series. I've learned a lot from you, thanks. Would it be possible to have a GNN tutorial for text classification? :)

  • @stevenrodrig14 2 years ago

    Great video! Just wondering why you chose MLP layers of size 128? Is 128 the total number of possible combinations in the adjacency matrix?

    • @DeepFindr 2 years ago

      Hi! Thanks :)
      This is just the embedding dimension. The adjacency matrix is predicted by taking all possible combinations of these embeddings per node. This number is dynamic, however: one graph has 10 nodes, another has 15. So the size of the adjacency matrix changes from graph to graph.
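One way to read "all possible combinations of these embeddings" is to concatenate the embeddings of each node pair and pass every concatenated vector to the edge decoder. This is an assumption about the approach, not the video's code, and `pair_inputs` is a hypothetical helper. With embedding dimension d, each pair vector has length 2 * d, while the number of pairs depends on how many nodes the graph has.

```python
from itertools import combinations

def pair_inputs(embeddings):
    # embeddings: list of n node embeddings, each of length d.
    # Returns one concatenated vector of length 2 * d per unordered node pair.
    return [embeddings[i] + embeddings[j]
            for i, j in combinations(range(len(embeddings)), 2)]

d = 3  # toy embedding dimension (the thread above mentions 128)
graph_a = [[float(k)] * d for k in range(10)]  # 10 nodes
graph_b = [[float(k)] * d for k in range(15)]  # 15 nodes

pairs_a = pair_inputs(graph_a)   # 10 * 9 / 2 = 45 pair vectors
pairs_b = pair_inputs(graph_b)   # 15 * 14 / 2 = 105 pair vectors
```

Under this reading the decoder's input width stays fixed at 2 * d, and only the number of rows fed through it changes from graph to graph.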

  • @lennarth.3270 2 years ago

    Hey, I am currently trying to rebuild the system. Unfortunately, the website you linked on your GitHub to get the dataset looks a bit fishy and redirects me to other websites. I was wondering if I can get the data somewhere else? I would really appreciate it :)

    • @DeepFindr 2 years ago

      Hi!
      Was it this MoleculeNet dataset?
      moleculenet.org/
      Thanks for this info!

    • @DeepFindr 2 years ago

      Yep, that link is outdated. I updated it, thanks!

    • @lennarth.3270 2 years ago

      @DeepFindr Thank you :)