As far as strings are concerned, I usually just use DJB's algorithm. It's super simple and generally does good enough on most strings people are likely to input. If you're doing more for this series, I would suggest implementing a foreach function for the table as well as a sort. My basic implementation of a hash table just stores everything in a single array and then implements the table as a secondary array of arrays. For small purposes, within a few gigs, it works fairly well. I don't often need a container that maintains sorting order and when I need things sorted I can just use qsort(). I allocate using powers of two so normalizing a hash requires only a bitwise `and` and since I save the unnormalized hash I can resize the table on a dime. And I cheat when doing deletions by simply swapping the last element with the one to be deleted and decrement the count. Only requires two updates in the table.
7:42, I would rather create an empty cleanup function to default to that just exits immediately. free is the same number of characters as NULL so the dev should be responsible for specifying it as it costs them nothing to do so.
technically it costs a level on the stack ... that can't be optimized out because the modules are separate (even if you inline the function, an in-binary one is created). Hrm... now in my mind I'm wondering if a smart compiler can just make a trampoline out of a function that is basically "void myfree(void *p){free(p);}" ... Can it just JMP to free (not JSR)?
Hrm... I take that back. compiler explorer seems to say that the compiler can anticipate that a function with nothing but a call can be a jMP rather than a JSR. Still a jump, tho ... and the code is still going to call it since you use a function pointer.
Still bad form recommendation IMHO. If you're simply using a stock function, you should use a stock function. This is also why I'd advocate that plain-old-C should also have lambda.
@@randomscribblings But you still shouldn't assume the data was dynamically allocated, better to demand they declare they want free as the call than to assume they want it and cause a segfault attempting to de-allocate memory that was static or on the stack
@@zxuiji Hrm... My leaning would be: if the definition is calling it a freeing function, then free(3) is an acceptable default. Since you're taking a void pointer, you're going to have to work hard to produce something in C that free would wholesale fsck-up. If the definition were calling it a desctructor (which is OK in C --- you can do OOP in C), then NULL might be a better default. Here's the rub: default-free(3) will be caught at development time ... at the very least with valgrind or equivalent memory checker. Default-NULL could leak memory quietly for decades.
I'm currently trying to use this implementation of a hash table inside of a shared memory segment, so that multiple processes (using fork()) can perform CRUD operations on the same hash table. Can anyone give me a rough outline on how I can achieve this? I am kind of lost here, since I am also relatively new to C.
funnily enough, I've been stuck in python a lot lately --- and this idiom for bypassing nulls is very useful there, doubly so where a null is not an accepted value...
as a cs student i am thankfull for your videos they are top class quality
Thank you. I always learn something from these videos.
As far as strings are concerned, I usually just use DJB's algorithm. It's super simple and generally does good enough on most strings people are likely to input. If you're doing more for this series, I would suggest implementing a foreach function for the table as well as a sort. My basic implementation of a hash table just stores everything in a single array and then implements the table as a secondary array of arrays. For small purposes, within a few gigs, it works fairly well. I don't often need a container that maintains sorting order and when I need things sorted I can just use qsort(). I allocate using powers of two so normalizing a hash requires only a bitwise `and` and since I save the unnormalized hash I can resize the table on a dime. And I cheat when doing deletions by simply swapping the last element with the one to be deleted and decrement the count. Only requires two updates in the table.
Knuth has a lot to say on hashes for hash tables. DJB might be riffing on that, dunno. Knuth is always a good read.
The animation was really nice!
7:42, I would rather create an empty cleanup function to default to that just exits immediately. free is the same number of characters as NULL so the dev should be responsible for specifying it as it costs them nothing to do so.
technically it costs a level on the stack ... that can't be optimized out because the modules are separate (even if you inline the function, an in-binary one is created). Hrm... now in my mind I'm wondering if a smart compiler can just make a trampoline out of a function that is basically "void myfree(void *p){free(p);}" ... Can it just JMP to free (not JSR)?
Hrm... I take that back. compiler explorer seems to say that the compiler can anticipate that a function with nothing but a call can be a jMP rather than a JSR. Still a jump, tho ... and the code is still going to call it since you use a function pointer.
Still bad form recommendation IMHO. If you're simply using a stock function, you should use a stock function. This is also why I'd advocate that plain-old-C should also have lambda.
@@randomscribblings But you still shouldn't assume the data was dynamically allocated, better to demand they declare they want free as the call than to assume they want it and cause a segfault attempting to de-allocate memory that was static or on the stack
@@zxuiji Hrm... My leaning would be: if the definition is calling it a freeing function, then free(3) is an acceptable default. Since you're taking a void pointer, you're going to have to work hard to produce something in C that free would wholesale fsck-up. If the definition were calling it a desctructor (which is OK in C --- you can do OOP in C), then NULL might be a better default. Here's the rub: default-free(3) will be caught at development time ... at the very least with valgrind or equivalent memory checker. Default-NULL could leak memory quietly for decades.
Great video. Could you add this to the Data Structures playlist? Thanks!
I just checked. It's on there.
@@JacobSorberYeah I'm blind as hell. Thanks for the quick reply! Keep up the great work!
I'm currently trying to use this implementation of a hash table inside of a shared memory segment, so that multiple processes (using fork()) can perform CRUD operations on the same hash table.
Can anyone give me a rough outline on how I can achieve this? I am kind of lost here, since I am also relatively new to C.
ht->cleanup = cf ? cf : free; An idiom, I know, but we're talking C here.
funnily enough, I've been stuck in python a lot lately --- and this idiom for bypassing nulls is very useful there, doubly so where a null is not an accepted value...
How's that key leak in delete?