Succinctly explained. Thanks so much!
Very good explanation on the topic, this moment when you explained the usage of the words was perfect to me
excellent video. that was really understandable
This was a very informative video. Thanks!
Thank you so much for this video! I understand my course video now! 😊
Awesome video 🔥
Great explanation. Thanks!!
Great content, extremely easy to understand the topic, keep up the great work dude 🔥
Thank you for this wonderful content.
I really appreciate your work. Easily understandable, even to a school boy
Thx for awesome content, Monis!
Waiting for the next video with real-time examples.
I guess this might be part of your next video, but this one leaves some very important things unsaid.
1. There are different purposes for hashing. A cryptographic hash function has very different requirements than a hash function used for hash tables. For example, a few collisions in a hash map only slightly degrade its performance. Having many collisions in a hash map is bad. Finding even one collision in a cryptographic hash is really bad.
2. Collisions are not always bad. SimHash is used to estimate similarity between sets. There, similar inputs landing on the same or nearby hash values is part of the design.
3. Don't store passwords using MD5! Or even SHA-256. bcrypt is designed for this, so it's fine. In fact, MD5 should not be used in any crypto context because its collision resistance is broken.
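To make point 2 a bit more concrete, here is a minimal SimHash sketch. It is a toy version under my own assumptions, not a reference implementation: whitespace-split words as tokens, equal weight per token, and truncated SHA-256 as the per-token hash (used only for bit mixing). The names `simhash` and `hamming_distance` are made up for this example.

```python
import hashlib


def simhash(tokens, bits=64):
    """Return a `bits`-wide SimHash fingerprint for an iterable of string tokens.

    `bits` must be a multiple of 8 and at most 256, because each token hash
    is taken from the first bits/8 bytes of a SHA-256 digest.
    """
    votes = [0] * bits
    for token in tokens:
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:  # the majority of tokens had this bit set
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


doc_a = "the quick brown fox jumps over the lazy dog".split()
doc_b = "the quick brown fox jumped over the lazy dog".split()
doc_c = "hash tables resize when the load factor grows".split()

print("a vs b:", hamming_distance(simhash(doc_a), simhash(doc_b)))
print("a vs c:", hamming_distance(simhash(doc_a), simhash(doc_c)))
```

Inputs that share most of their tokens tend to end up with fingerprints that differ in few bits, so a small Hamming distance is read as "probably similar". That is the opposite of what you want from a cryptographic hash, where a one-character change should flip about half of the output bits.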
Regarding your points, I think a couple of them are not quite accurate. Allow me to clarify:
1. Cryptographic hash functions (CHFs) don't have "very different" requirements. Rather, CHFs are a specific type of hashing algorithm, meaning they have "additional" requirements (not different ones). The existing requirements (or desirable traits) still stand: they still MUST be deterministic, they still SHOULD produce an output of fixed length (with some exceptions) and they still MUST be irreversible. Coming to the point about HashMaps (or HashTables in some languages): the reason collisions only "slightly" degrade performance is that we usually don't store a lot of data in HashMaps (referring to the in-memory data structure, not to be confused with hash-based database indexes). Some object-oriented programming languages also allow you to define your own hash functions for object hashing. In that case, it is still possible to define a (bad) hash function that returns a constant value, so that all data collides and goes into the same hash bucket. That defeats the purpose of the HashMap. So it's less about the HashMap as a data structure and more about the amount of data and the underlying hashing algorithm, which together determine whether collisions degrade performance "slightly" or "significantly".
3. Agreed that MD5 is a bad CHF. However, I don't agree about SHA-256. Storing user passwords with SHA-256 is not the best choice; BCrypt is (Argon2 even better). However, using SHA-256 for service-to-service authentication between two microservices over an internal network that is not exposed to the internet is quite reasonable when you consider credential rotation in practice. If we used BCrypt there, it could significantly slow down service-to-service auth.
So it's more about the "right fit" for the "right use case".
--
It is easy to get carried away by a lot of information, but the purpose of this video was to introduce people to the concept of hashing and lay a good foundation in a way that doesn't overwhelm them :)
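The constant-hash case from point 1 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the classes `Key` and `ConstantKey` and the helper `build` are hypothetical, and counting `__eq__` calls is just a stand-in for "extra work caused by collisions".

```python
class Key:
    """Wraps an int and counts how often equality is checked."""

    eq_calls = 0

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        type(self).eq_calls += 1
        return isinstance(other, Key) and self.n == other.n

    def __hash__(self):
        # Reasonable hash: different n values give different hashes here.
        return hash(self.n)


class ConstantKey(Key):
    """Same as Key, but every instance hashes to the same value."""

    eq_calls = 0

    def __hash__(self):
        return 42  # deliberately bad: every key collides


def build(key_type, n=500):
    """Insert n keys into a dict and report how many __eq__ calls that took."""
    key_type.eq_calls = 0
    table = {key_type(i): i for i in range(n)}
    return table, key_type.eq_calls


good_table, good_eq = build(Key)
bad_table, bad_eq = build(ConstantKey)
print("equality checks with a reasonable hash:", good_eq)
print("equality checks with a constant hash:  ", bad_eq)
```

Both dicts end up with the same contents, but with the constant hash every insert has to be compared against the keys already stored, so the work grows much faster than the number of entries.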
@MonisYousuf First, and I should've said this before, I actually like your videos. It is hard to find a balance between making a tutorial video too hard and too shallow. And your videos generally do a good job! I just think that this particular one has a couple of problems.
Perhaps I didn't state my points precisely enough, for which I apologize. When I said that a CHF has different requirements than a hash function for a hash table (I'll call these HTFs), I meant the additional requirements. Specifically, it has to be practically impossible to find a collision in a CHF. This is not a requirement for an HTF. An HTF has to be very fast, otherwise the hash table operations will be very slow; that's one of the pitfalls of writing your own HTF. Password hashes like BCrypt, on the other hand, are better off being relatively slow, to make brute-force attacks more expensive.
Regarding the degradation, I meant that with few collisions in an HTF you'd potentially get better performance, while with lots of collisions you're guaranteed bad performance. There are other factors, like how collision resolution is implemented, but that goes beyond the hash function itself. The amount of data in the hash table is not so directly related to the slowdown, IMO. Hash tables normally double their capacity once the amount of stored data passes some predefined threshold (the load factor), so they are never "full".
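Here is a rough sketch of the resizing behaviour I mean. It is a toy table with separate chaining; the class name `ChainedHashTable`, the starting capacity of 8 and the 0.75 threshold are my own choices for illustration, and real implementations differ in the details.

```python
class ChainedHashTable:
    """Toy hash table: separate chaining, doubles when the load factor is exceeded."""

    def __init__(self, capacity=8, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _bucket(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))
        self._size += 1
        # Grow before the chains get long: double once the threshold is passed.
        if self._size > self._max_load * len(self._buckets):
            self._resize(2 * len(self._buckets))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def _resize(self, new_capacity):
        old_buckets = self._buckets
        self._buckets = [[] for _ in range(new_capacity)]
        for bucket in old_buckets:
            for key, value in bucket:
                self._bucket(key).append((key, value))

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return len(self._buckets)


table = ChainedHashTable()
for i in range(1000):
    table.put(i, i * i)

print("entries:", len(table), "capacity:", table.capacity)
print("load factor:", len(table) / table.capacity)
```

However many entries go in, the load factor stays at or below the threshold, so the average chain length depends mostly on how evenly the hash function spreads the keys and not on how much data is stored.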
For SHA-256, I meant that it's bad specifically for storing passwords. I agree that explaining this is probably beyond the scope of this video. But perhaps choosing a different example would have been better. The current one just suggests that it's OK to do so, which I think is bad.
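To show what I mean by "designed for passwords", here is a small sketch that uses only the Python standard library. Note the substitution: BCrypt and Argon2 need third-party packages, so I'm using PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in for a deliberately slow, salted password hash. The iteration count and the function names are assumptions for illustration, not a recommendation.

```python
import hashlib
import hmac
import os

# Assumed work factor, for illustration only; tune it to current guidance and your hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, derived_key) for storage alongside the user record."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, derived


def verify_password(password, salt, iterations, expected):
    """Recompute the derived key and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


# For contrast: a single unsalted SHA-256 is very cheap for an attacker to compute in bulk.
fast_digest = hashlib.sha256(b"correct horse battery staple").hexdigest()

salt, iterations, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iterations, stored))
print(verify_password("wrong guess", salt, iterations, stored))
```

The salt means two users with the same password get different stored values, and the iteration count makes every guess cost the attacker that many hash computations instead of one.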
I agree that an introductory video, such as this one, shouldn't go too deep because it will be more confusing. But I still think that it should explain that there are different applications for hash functions, otherwise people might start using SHA-256 for hash maps :)
Salam alykom friend, I'm no editing expert, but the audio might sound better if you let yourself finish the sentence at the start of the video, no? Other than that, great explanation.
lol, I'm a DevOps engineer and I'm watching this