07 - Hash Tables (CMU Intro to Database Systems / Fall 2022)

แชร์
ฝัง
  • เผยแพร่เมื่อ 31 ธ.ค. 2024

ความคิดเห็น • 15

  • @foxl9195
    @foxl9195 2 ปีที่แล้ว +6

    What's that Hacker News Driven Development database? 44:07

  • @foxl9195
    @foxl9195 2 ปีที่แล้ว +2

    To the student statement (49:57) on Cuckoo hashing, IIUC: I think if we add new hash tables on cycle detection instead of resizing existing tables, we end up with non-constant lookup time complexity, because we need to compute more and more hashes as time goes on.

    • @olegpatraschku3736
      @olegpatraschku3736 2 ปีที่แล้ว

      To me for e.g. it wasn't clear how we solve cycles, what if we resized the existing tables but our keys got rehashed again at some unfavorable positions thus creating a new cycle and so on we can keep resizing the tables until we run out of memory ....

    • @foxl9195
      @foxl9195 2 ปีที่แล้ว +1

      @@olegpatraschku3736 I believe it can happen with extremely low probability if your hash functions are good

  • @yuejiat0416
    @yuejiat0416 ปีที่แล้ว

    In 21:20 when they are discussing the fact that there is a drop in throughput once the hash function meets a key size of power of 2, I still don't understand why there is a drop. Could someone explain this? Thanks!

  • @rohandvivedi
    @rohandvivedi ปีที่แล้ว

    Hi, I am also interested in knowing what are different strategies in structuring the bucket pointers/or buckets in the disk based hashtable. I mean how to keep track of lets say 10,000 buckets, when the page_size only permits 100 bucket pointers on the page.

  • @foxl9195
    @foxl9195 2 ปีที่แล้ว +1

    So Robin Hood hashing and Cuckoo hashing are both read optimized, while Linear probe hashing is more of a balanced approach?

  • @nipulsindwani117
    @nipulsindwani117 2 ปีที่แล้ว

    at 35:02 after inserting G , I am interested how the next round of search for hash value of D works because according to me it should break at G as not found

    • @alexalexfff
      @alexalexfff 2 ปีที่แล้ว

      It'l scan untill empty slot at pos 1 and return nothing

  • @foxl9195
    @foxl9195 2 ปีที่แล้ว

    Don't we *always* need to look at the global counter number of bits? I think it's confusing when you say that you need to consider less bits than the global counter for some buckets (1:07:45). AFAIU, the local counter matters only when we need to split. If it equals to the global counter, we grow the table and split the bucket; if not, we just split the bucket. In all other cases we don't care about the local counter at all. Correct me if i'm wrong, please.

    • @ingsuifon8178
      @ingsuifon8178 ปีที่แล้ว +1

      I think you are right, and Andy has said "you needn't implement the local counter", it's just for demonstration.

  • @foxl9195
    @foxl9195 2 ปีที่แล้ว

    For linear hashing, if the table is large enough, and we choose the (splitPointer-1) bucket and then keep putting new keys into it, would we ever get back to it again? I feel, if we keep overflowing it over and over again, the table would grow exponentially, and the algorithm would never get a chance to split that one bucket.

  • @Am-fb6ng
    @Am-fb6ng 2 ปีที่แล้ว +3

    why is he wearing a mask?

    • @w0nnafight
      @w0nnafight ปีที่แล้ว +1

      it would be extremely painful otherwise

    • @ahuachapan2
      @ahuachapan2 ปีที่แล้ว

      Something called COVID was unleashed and spread like a wildfire. The other option is that he wants to protect his identity 'cause it is a Batman issue.