After the past days' difficulty, today's day 22 sliding-window counting puzzle felt like a day off for recovery...
Probably because I'm working in C and don't have things like sets and list slicing available by default, I came up with a different implementation of what is basically the same concept for part 2: keep a collection of running totals for what each possible sequence of 4 differences found in the random sequence would have made.
Storing only the last and current value for computing the difference, I turned each difference into a value from 0 to 18, which becomes part of a running hash by multiplying the previous hash by 19, modulo 130321 (which is 19^4), and adding the new part. This can be used as an index into an array of running totals, each one tagged with the ID (input line number) of the last monkey that added to that pile, so we only use the first occurrence of a sequence for a given monkey. 130321 isn't that big for an array of 32-bit ints.
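A minimal C sketch of that rolling-hash bookkeeping, assuming prices[] already holds one buyer's digit prices (the function and array names here are made up for illustration). Multiplying by 19 modulo 19^4 keeps exactly the last four base-19 digits, because the oldest digit's contribution of 19^4 vanishes mod 130321:

```c
#define HASH_SPACE 130321              /* 19^4: one slot per 4-difference sequence */

static long totals[HASH_SPACE];        /* bananas earned per sequence              */
static int  last_buyer[HASH_SPACE];    /* ID of the last buyer who added to a pile */

/* Fold one buyer's prices into the global totals. Buyer IDs must start
   at 1, because the static last_buyer[] array zero-initializes. */
void add_buyer(const int *prices, int n, int buyer_id)
{
    unsigned h = 0;
    for (int i = 1; i < n; i++) {
        /* difference is -9..9; shift to 0..18, one base-19 digit */
        unsigned d = (unsigned)(prices[i] - prices[i - 1] + 9);
        h = (h * 19 + d) % HASH_SPACE;          /* rolling hash of last 4 diffs */
        if (i >= 4 && last_buyer[h] != buyer_id) {
            last_buyer[h] = buyer_id;           /* count first occurrence only  */
            totals[h] += prices[i];
        }
    }
}
```

The part 2 answer is then just the maximum over totals[].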
Someone commented this on one of the earlier days as well, but if b is a power of 2, a % b equals a & (b - 1). No need to fiddle with bits or hex numbers
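A tiny worked example of that identity, using b = 2^20 to match the 20-bit table in the next comment (the value of a is arbitrary):

```c
#include <assert.h>

int main(void)
{
    /* a % b == a & (b - 1) whenever b is a power of two, here b = 2^20 */
    unsigned a = 123456789u;
    assert((a % (1u << 20)) == (a & ((1u << 20) - 1)));  /* both equal 773397 */
    return 0;
}
```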
Thanks for all your videos! I map the tuples of 4 changes to a 20-bit number (each change is between -9 and +9, which can be shifted to 0 to 18 and fits into 5 bits); that allows me to use 1D arrays to track a) whether we have already seen the tuple for the current buyer and b) the overall income. Run time: 0.06s for both parts with pypy.
I did the same and then went one step further (20 bits still needs over a million entries and it annoyed me to waste bits). You can fit the whole sequence into 17 bits, since 19^4 = 130321 < 2^17. The downside is that you can't use bit operations anymore: instead of shifting and masking, each new change is folded in with a multiply by 19 and a modulo.
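A sketch of the two packings side by side; the exact expressions are reconstructed from the thread (the comment above was cut off mid-formula), so treat them as an assumption rather than either commenter's actual code:

```c
/* Fold a new change c (already shifted to 0..18) into the rolling
   code for the last four changes. */
unsigned pack20(unsigned seq, unsigned c)
{
    return ((seq << 5) | c) & 0xFFFFF;  /* 4 x 5 bits; the mask drops the oldest */
}

unsigned pack17(unsigned seq, unsigned c)
{
    return (seq * 19 + c) % 130321;     /* base 19; 19^4 = 130321 fits in 17 bits */
}
```

pack17 shrinks the tracking arrays from 2^20 to 130321 entries, at the cost of a multiply and a modulo per step.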
@@TheTrienco You really need only one (global) 'seen' array, not one per buyer. Simply store in that array the latest buyer ID for which you encountered a particular sequence (buyer 1, 2, etc.). If the stored buyer ID is less than the current buyer's ID, then we have not yet seen the sequence for the current buyer (and we'll set seen[seq] = current_buyer_id).
@@mathiaskern7031 Oh, that is actually a good way to avoid reallocating or clearing each time. I tried the change with my current implementation and it does seem to buy me around 0.5ms on average; I expected a bigger difference, actually. It replaces a vector allocated in every iteration with a single vector. The allocations and the access overhead (it's a bitset under the hood) should make your version a lot faster, since the only benefit of the local vector is its smaller size, and the only vague explanation I have is "maybe more cache hits".
Edit: when I'm not an idiot and apply the change to the previous (20-bit) version, it does go down to around 12.5ms. The much more drastic change is in unoptimized debug builds: from noticeable seconds down to 50ms.
What plugin do you use to see the errors directly on the line? Looks very useful
Not sure if this helps, but some things I do differently:
1. I generate all 2000 secrets first, then find the prices and the differences, and then populate the map with the different patterns (see the sketch below). That way it runs in n + n + n + n = O(n) passes; it is still 4n, but with this approach I found my code ran quite instantaneously.
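A rough C sketch of those separate passes, assuming the puzzle's secret-update rules (multiply by 64, divide by 32, multiply by 2048, each mixed in with XOR and pruned modulo 16777216); the names and structure are invented for illustration, not the commenter's code:

```c
#include <stdint.h>

#define STEPS 2000

/* Assumed secret update from the puzzle: three mix (XOR) + prune (mod 2^24) steps. */
static uint32_t next_secret(uint32_t secret)
{
    uint64_t s = secret;                 /* widen: s << 11 can exceed 32 bits */
    s = (s ^ (s << 6)) & 0xFFFFFF;       /* mix in s * 64,   then prune       */
    s = (s ^ (s >> 5)) & 0xFFFFFF;       /* mix in s / 32,   then prune       */
    s = (s ^ (s << 11)) & 0xFFFFFF;      /* mix in s * 2048, then prune       */
    return (uint32_t)s;
}

/* Pass 1: all secrets; pass 2: prices; pass 3: differences.
   Pass 4 (not shown) feeds diffs[] into the pattern map. */
void process_buyer(uint32_t seed, int prices[STEPS + 1], int diffs[STEPS])
{
    uint32_t secrets[STEPS + 1];
    secrets[0] = seed;
    for (int i = 1; i <= STEPS; i++)
        secrets[i] = next_secret(secrets[i - 1]);
    for (int i = 0; i <= STEPS; i++)
        prices[i] = (int)(secrets[i] % 10);     /* price = last digit */
    for (int i = 0; i < STEPS; i++)
        diffs[i] = prices[i + 1] - prices[i];   /* each in -9..9 */
}
```

Note that the prune step is exactly the a & (b - 1) trick from earlier in the thread, since 16777216 is 2^24.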
Can you explain this further? Just generating 1600+ lists of 2000 entries, and only then finding the mod 10 and differences can't possibly be faster than keeping track of only the last 4 differences and the last value for each list. How 'instantaneous' was your answer?
@@eavdmeer you're right, my bad. Initially I got the impression that my code was running instantaneously because of the print statements I had earlier. Sorry for the misunderstanding
@@heinrich7521 no worries. I was just hoping you'd come up with a better optimization. I can't get beyond making the hash key more efficient
To be fair, even in other languages this kind of old-school micro-optimization has been pointless for decades. For example, any compiler that isn't worthless will already replace those operations with bit operations when an argument is a power of two. I'd expect that even the Python interpreter might be doing that. Of course, if those values are runtime values and only the programmer knows that they will always happen to be a power of two, a compiler can't do that.
use defaultdicts instead of dicts