Constraints: Your code will run inside a Python 2.7.13 sandbox. All tests will be run by calling the solution() function. Standard libraries are supported except for bz2, crypt, fcntl, mmap, pwd, pyexpat, select, signal, termios, thread, time, unicodedata, zipimport, zlib. Input/output operations are not allowed. Your solution must be under 32000 characters in length including new lines and and other non-printing characters.
Why not just generate a string of primes with length 10,005 using a shell script and hard code it into the solution, this would make it much much faster
for prime_str, i think if u use the built-in "filter" function instead of comprehension you would gain some speed... it's roughly 15% faster to use filter instead comprehension (15% for that statement only)
Here's my solution to "patch" on the prime string on at runtime. It's a bit hacky, but ideally a function like this would be used with a generator or a class object to remember state. There are improvements that can be made to make it scale since right now it just uses the hardcoded limit of 10_000 def solution(n): if "prime_str" not in dir(solution): max_int = 10_000 number_set = [True] * max_int number_set[0] = number_set[1] = False sieve_end = max_int ** 0.5 for number, is_prime in enumerate(number_set): if number
@@schattenskorpion That's where I think this is a bit hazy. It is completely self contained inside the function and doesn't pollute the global namespace, but it does set a "state" for the function itself. It would really depend on what their meaning of having a stateless function would be
@@NomanKhan-jf6pq Sure, so my implementation follows along almost perfectly with what b00l did, except that it treats the function as an object (since everything in Python is an object under the hood). So during runtime, what happens is the first time this function gets called, the member "solution.prime_str" doesn't exist (that's what the "prime_str" not in dir(solution) does). From here, it follows the same code as provided, and then attaches the prime_str to the solution function. Then, each time the function gets called, it can find the solution.prime_str member and index into that
The code can be improved if you add in an elif which breaks the loop once number > (maxInt**(1/2)). Since there will be no other composite numbers in the list after this condition, the list will not change from this point on and there is no need to check the rest of it to the if statement condition.
Great observation. I see what you mean, instead of checking each number after (maxInt**(1/2)) against the if-statement, knowing each number would be False, just break out of the loop all together.
Why do you use number**2 as the start of the range in line 23? Doesn't this lead to missing multiples of that number. For example with 10 the start would be 100 up to maxInt so you would not find the multiples 20, 30, 40, 50, 60, 70, 80 and 90. Pls explain.
Each of those gets thrown out by 2. You can try this with any number, it’s multiples up to number**2, are already checked by previous numbers multiples. I realize I didn’t explain that part very well.
Another way you can think about this is that 5 * 2 is equivalent to 2 * 5, same with 3 * 5, 4 * 5, etc. The first "new" composition will always be the number times itself, i.e. 5 * 5 or 5**2, so you will always throw out the compositions prior to reaching the squared version of the number.
Guys i am not even close to the level or solving google interview problems. I was decently following along, but I wouldnt imagine a common person just implement this logic and solution this easy and fast right? What i mean i need to do step by step, drawing board, trial and error and maybe than i can get somewhere close, and definitely not even in 30 min, please tell me i am not the only one.
Oh no, this is several hours of thinking all condensed into 10 short minutes! There was ALOT of drawing board, trial and error done in the background. :)
It seems to me, you create a function to determine the string, then call that function in a function to determine the string just once initially. That would not be an import and you only create the string once and you can adapt the string length based on the amount needed with little more lines of code.
@@chri-k To me that seems like really piss poor coding and something that is really beneath Google. At least create a single function that creates the string once and reuses it to calculate all the needed numbers. While that is sloppy coding combining two functions into one, it would work.
@@mind_of_a_darkhorse this is an interview question. They are trying to get you to make a time- efficient algorithm to generate the string at different values - and realize that you need to make it time-efficient by yourself. There is no other way to do this because it is not a practical or good thing to do in and of itself - as you said - and a more complicated scenario does not make for a good interview question. or that’s what i think their reasoning is.
@@chri-k My thought is that on one hand, I get what you're saying, but overall the premise is rather a dumb way to achieve the goal of the intended outcome. I guess there is quite a bit more that I have to learn.
The function isn’t really collision free though, is it? There’s no way the input of 1 to 10,000 returns 10,000 (five digit) unique IDs, given that, in the concatenation, no prime begins or ends with leading zeros, the only way of arriving at something like 00001 is if it were nested in 100,001. The video shows, for the first 10,000 inputs (on the x axis) the search space maxes around primes under 20,000 (y axis), loosely demonstrating the function is not bijective. The commander should known this is not foolproof.
You don't need 4 leading zeros to have 10,000 unique ids though. It's possible to have 10,000 unique ids with 5 random consecutive digits of primes, because there's 100,000 combinations with 5 digits. However, within the first 10.000 digits, there's repeating substrings of up to 9 digits (113127131). So it's possible, but unlikely to get 10,000 unique 5 digit ids.
I think the index is based on the number of primes, not the number of characters, and if I'm not mistaken then you don't need to generate a string from all the numbers, just enough to acquire 5 characters. So I would add 5 to the sieve size and then take the number at the index plus the next 5 and stringify them, then get 5 characters.
using memoization to remember the already computed prime numbers and generate more prime numbers if needed, it dosent use any import or exports, its just saved in memory along with the function
"no imports and exports" was just a badly phrased way to say "the function isn't allowed to save any state between iterations. He explicitly mentioned this strategy and that it's not allowed in the video.
this task was flawed from the beginning. I thought "that can't create unique IDs!", then i tested it. There are 1131 non-unique IDs, e.g. 210 and 448 both have the id 94194
got invited to do their challenge a while ago, made it to stage 4 but failed one sadly ;(. now i know i had to use absorbing markov chains for ditrubited matrices but thats too advanced for me lol. goodluck to those who get it
Yeah the markov chain question was a bit crazy... I recognized it might involve markov chains, then managed to translate the maths from wikipedia into code. I guess that's what you're supposed to do??
I have Google Foobar unlocked but I'm extremely underqualified to take it and succeed. You can thankfully leave it for as long as you like before taking it. No clue when I'll eventually take it
I made it 4 times slower lol: def isprime(n): if n in {2, 3}: return True if n % 2 == 0 or n < 2: return False return all(n % i != 0 for i in range(3, int(n ** 0.5) + 1, 2)) def solution(n): str_len = n + 5 integers = list(range(n, int(str_len * 5000 / 2465 + 476))) prime_str = ''.join([str(x) for x in integers if isprime(x)]) return prime_str[n-1:str_len-1] b001: 22.4 seconds Greg: 83.9 seconds
Hey, first off, I've been totally addicted to your videos lately. Compared to so many other TH-camrs "tips" your's are actually helpful and fun to watch. I know this a random question but may I ask what keyboard you are using or differently what switches it uses because I really like the sound of it :) Again, great job on your videos
I got a new keyboard for Christmas which is the Keychron V2 Wired Custom Mechanical Keyboard Knob Version with Keychron K Pro Red Switches. My previous keyboard was the Skyloong SK61 with gateron red switches
def solution(x): base = "2357111317192329.. return base[x,x+5] 1 to 1000 is not such a long string, just calculate it as pre work to the function later insert that long string as is into the function when the function is called just return by string index nowhere in this question was it said pre work cant be done this solution is O(1)
I can think of two ways of making this run faster by only needing to compute prime_str once. The easiest way would be to save it as an attribute to the function solution(): you just need hasattr(), setattr() and getattr(). Second would be to have a function to compute prime_str, say compute_prime_str(), which computes the value once, then replaces itself in the global namespace with a new function that just returns that computed value. You can do this with globals()["compute_prime_str"] = lambda : prime_str.
Is there a reason why I can't generate the string of primes up to 10005 digits, paste that into a string inside the solution function, and then grab the needed digits via indexing? I feel like this would be the fastest method, but maybe I'm missing something...
"> Your solution must be under 32,000 characters in length including new lines and and other non-printing characters" Welp, 10,005 digits fit with plenty of room to spare. This will be by far the best solution that follows the letter of the requirements. I wonder if people who provide that as an answer would be rewarded or punished.
One basic ideas (not a Python expert, so the compiler might do this already): Only loop over even indices once, then in the main loop iterate in steps of 2. Or better yet, start by constructing an alternating array of True, False, True... ( somewhat like number_set = [False, True] * (n//2) + [True] ) This isn't an asymptotic improvement anyways, so pretty irrelevant. An asymptotic improvement could be achieved by using a better method of finding primes, like The Sieve of Atkin. Edit: Another idea/remark: Id prefer iteratively growing the prime sieve, counting the number of primes with d digits and adding that up. I think your heuristic formula approach is very risky, a good interviewer might ask you to proof that formula. How would you explain its correctness? A heuristic implies knowledge about the distribution of prime numbers, which is one of the greatest unsolved problems in mathematics. If you can't prove your solution to be correct, don't you have a problem? You are essentially giving an upper bound for this problem (several exist afaik), you'd have to explain how you got that. Since n is also given an upper bound, this is pedantic and in this simple case the solution is probably fine.
Am I stupid or can he just create a global variable that holds the string, compute it once, then skip computing every time and just re-use the old string? Theoretically you could also append to the string as needed if a value is input that is higher than the string already calculated. That way you don't have to calculate up to a length 10000 string if a high enough number isn't pulled. Again, a global to hold maxInt would allow you to check that condition and continue calculation.
Honestly I'd have done something similar, but I would have ran separate code to generate the string for n = 10000 and just copy and pasted it into a variable lol. Its hacky but would absolutely cut compute time
here is my optimization over your solution that runs ~30% faster on my machine, compared to your algorythm, although I have to admit that the key point is finding the formula for the maximum prime required. Your formula needs a bit of an optimization on the lower end, but still, it is prety good one. def solution4(n): str_len = n + 5 maxInt = int(str_len * (5000/2465) + 475) numberSet = [True] * maxInt numberSet[0] = numberSet[1] = False sqrt_maxInt = maxInt ** (1/2) number = 2 while number number and _) prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y) return prime_str[n:str_len]
I shaved off another ~5% by using a for loop instead of a while loop, which looked like: for number in range(2, sqrt_maxInt): if numberSet[number]: for multiple in range(number**2, maxInt, number): numberSet[multiple] = False
@@b001 if you want to be a bit more precise with the formula: maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34) your formula gives a mean value of the deviation from the required maxInt of 380 this one gives 221 overall and for the second portion (n>5000) it even gives 177 in addition to the bellow suggestion for the for loop, the end result is: def solution7(n): str_len = n + 5 maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34) numberSet = [True] * maxInt numberSet[0] = numberSet[1] = False sqrt_maxInt = int(maxInt ** (1/2)) + 1 for number in range(2, sqrt_maxInt): if numberSet[number]: for multiple in range(number**2, maxInt, number): numberSet[multiple] = False prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y) return prime_str[n:str_len] and runtime drop from 25 seconds (original solution) to 18 seconds(solution4 from previous reply) to 17 seconds (solution7 current reply), on my current machine. If you wonder where this formula comes from, I used the primes needed for n in (0, 4735, 10000) to draw a circle with center at (271533,-121780) and radius 297595. Then I used the y axis of the arc between x=(0,10000) for the maxInt which I padded with 34 (to compensate the inaccuracies of my calculations). Perhaps a better arc can be used.
a few more miliseconds shoved off with a suggestion from @schattenskorpion def solution12(n): str_len = n + 5 maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34) numberSet = ([False, True] * (maxInt//2) ) + [True] numberSet[0] = numberSet[1] = False numberSet[2] = True sqrt_maxInt = int(maxInt ** (1/2)) for i in range(4, maxInt, 2): numberSet[i] = False for number in range(3, sqrt_maxInt, 2): if numberSet[number]: for multiple in range(number**2, maxInt, number): numberSet[multiple] = False prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y) return prime_str[n:str_len]
@@b001 and using chatGPT's suggestion for optimized prime algorithm, I managed to compile this: def solution_chat(n): maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34) sqrt_maxInt = int(maxInt ** (1/2)) + 1 size = maxInt // 2 + 1 primes = [2] numberSet = [True] * size numberSet[0] = False for number in range(3, sqrt_maxInt, 2): if numberSet[number//2]: for multiple in range(number**2, maxInt, number*2): numberSet[multiple//2] = False primes.extend(x*2+1 for x,y in enumerate(numberSet) if y) return ''.join(map(str, primes))[n:n+5] the explanation from chatGPT was: "This version of the function uses the Sieve of Eratosthenes algorithm with some additional optimizations. It first eliminates even numbers, then checks the odd numbers in the range using an optimized version of the Sieve of Eratosthenes using Sundaram sieve and segmented sieve to only check the numbers in the range. This should result in improved performance." But the suggested function had some nasty bugs, whuch I had to deal with and selectively merge it with my code. This last version gives another ~30% improvement compared to my last suggestion. So, overall improvemnt from the video version - a bit more than 50%
I was sent the invite link to foobar a couple years ago (with my google screen splitting into a message saying "you're speaking our language"), procrastinated doing this first challenge, and my invitation was rescinded. Everytime I remember, I want to punch myself in the face.
Im a begginer in python, a kind of a hobbist to be honest. I paused the video at the beggining and tryied to solve for myself. Here is my solution: def is_prime(n): if n
I took a stab at this. Without saving any sort of state, the best I could come up with was ~30% improvement: edit: Didn't see this should be for Python 2.7.13. I used Python 3.10 def solution(n): string_length = n + 5 max_number = int(string_length * 2.028 + 475) number_set = dict.fromkeys(range(3, max_number, 2), True) exp = max_number**0.5 prime_string = "" for number in number_set.keys(): if number > exp: break elif number_set[number] is True: prime_string += str(number) for i in range(number**2, max_number, number): if i in number_set: number_set[i] = False return prime_string[n - 1 : string_length - 1]
I think 'if number * number > max_number' is faster than your current check, if I'd have to guess. I think in general, number * number is faster than number**2.
This solution took 0.006s to run. It uses an cached sieve that is regenerated when it become too small def gen_prime_str(max_int: int): primes = [True] * max_int primes[0] = primes[1] = False for i, state in enumerate(primes): if state and i str: nonlocal prime_str, current_max while n > len(prime_str): current_max *= 2 prime_str = gen_prime_str(current_max * 2) return func(prime_str, n) return wrapped @gen_prime_string def solution(prime_str, n: int) -> str: return prime_str[n:n + 5]
I keyed your script so I could compare. Yours took 9.617628 seconds and mine took 0.037388 seconds. Just FYI. Hope people find my solution useful or at least educational.
Could you not have the Sieve of Eratosthenes function just run once, save it as a variable and then run your 10_000 for loop inside the solution? I tried it and got a time of 0.002 seconds however I don't know if putting the test inside of the solution is cheating. Here's my code: def solution(): def calculate_prime_str(max_ID): maxInt = int((max_ID+5) * (5_000/2_465) + 475) numberSet = [True] * maxInt numberSet[0] = numberSet[1] = False for number, isPrime in enumerate(numberSet): if isPrime and number
I thought and did the same thing, but part of the problem is that it has to be "stateless" which means that you can't store previous states. Every time the solution function is run, it can't use any outside or previous information
I'm not sure if this works with older versions of Python, but I think this is faster (it was faster than the implementation in the video when I tested it on my shitty laptop, but my shitty laptop was slower than the time it took in the video). from time import perf_counter as pf def solution(n): # 1
I think you ONLY need to create ONE prime_str for the highest n ( in this case n= 10000 ) then each number in range(n) extracts the [n : n+5] digits from this prime_str here is the modified code of yours : def solution(n): str_len=n+5 maxInt=int(str_len*(5000/2465)+475) numberSet =[True]*maxInt numberSet[0]=numberSet[1]=False for number,isPrime in enumerate(numberSet): if isPrime and number
My solution got 0.002 seconds i.e. 2 milliseconds. The most obvious and the fastest solution is to pre-generate the string for the 10_000 case then get all the id s from there. def solution(n): prime_str = '235711131719232931374143475359616771737983899710110310710911312713113713914915115716316717317918119119319719921122322722923323924125125726326927127728128329330731131331733133734734935335936737337938338939740140941942143143343944344945746146346747948749149950350952152354154755756356957157758759359960160761361761963164164364765365966167367768369170170971972773373974375175776176977378779780981182182382782983985385785986387788188388790791191992993794194795396797197798399199710091013101910211031103310391049105110611063106910871091109310971103110911171123112911511153116311711181118711931201121312171223122912311237124912591277127912831289129112971301130313071319132113271361136713731381139914091423142714291433143914471451145314591471148114831487148914931499151115231531154315491553155915671571157915831597160116071609161316191621162716371657166316671669169316971699170917211723173317411747175317591777178317871789180118111823183118471861186718711873187718791889190119071913193119331949195119731979198719931997199920032011201720272029203920532063206920812083208720892099211121132129213121372141214321532161217922032207221322212237223922432251226722692273228122872293229723092311233323392341234723512357237123772381238323892393239924112417242324372441244724592467247324772503252125312539254325492551255725792591259326092617262126332647265726592663267126772683268726892693269927072711271327192729273127412749275327672777278927912797280128032819283328372843285128572861287928872897290329092917292729392953295729632969297129993001301130193023303730413049306130673079308330893109311931213137316331673169318131873191320332093217322132293251325332573259327132993301330733133319332333293331334333473359336133713373338933913407341334333449345734613463346734693491349935113517352735293533353935413547355735593571358135833593360736133617362336313637364336593671367336773691369737013709371937273733373937613767376937793793379738033821382338333847385138533863387738813889390739113917391939233929393139433947396739894001400340074013401940214027404940514057407340794091409340994111412741294133413941534157415941774201421142174219422942314241424342534259426142714273428342894297432743374339434943574363437343914397440944214423444144474451445744634481448344934507451345174519452345474549456145674583459145974603462146374639464346494651465746634673467946914703472147234729473347514759478347874789479347994801481348174831486148714877488949034909491949314933493749434951495749674969497349874993499950035009501150215023503950515059507750815087509951015107511351195147515351675171517951895197520952275231523352375261527352795281529753035309532353335347535153815387539353995407541354175419543154375441544354495471547754795483550155035507551955215527553155575563556955735581559156235639564156475651565356575659566956835689569357015711571757375741574357495779578357915801580758135821582758395843584958515857586158675869587958815897590359235927593959535981598760076011602960376043604760536067607360796089609161016113612161316133614361516163617361976199620362116217622162296247625762636269627162776287629963016311631763236329633763436353635963616367637363796389639764216427644964516469647364816491652165296547655165536563656965716577658165996607661966376653665966616673667966896691670167036709671967336737676167636779678167916793680368236827682968336841685768636869687168836899690769116917694769496959696169676971697769836991699770017013701970277039704370577069707971037109712171277129715171597177718771937207721172137219722972377243724772537283729773077309732173317333734973517369739374117417743374517457745974777481748774897499750775177523752975377541754775497559756175737577758375897591760376077621763976437649766976737681768776917699770377177723772777417753775777597789779378177823782978417853786778737877787978837901790779197927793379377949795179637993800980118017803980538059806980818087808980938101811181178123814781618167817181798191820982198221823182338237824382638269827382878291829382978311831783298353836383698377838783898419842384298431844384478461846785018513852185278537853985438563857385818597859986098623862786298641864786638669867786818689869386998707871387198731873787418747875387618779878388038807881988218831883788398849886188638867888788938923892989338941895189638969897189999001900790119013902990419043904990599067909191039109912791339137915191579161917391819187919992039209922192279239924192579277928192839293931193199323933793419343934993719377939193979403941394199421943194339437943994619463946794739479949194979511952195339539954795519587960196139619962396299631964396499661967796799689969797199721973397399743974997679769978197879791980398119817982998339839985198579859987198839887990199079923992999319941994999679973' solution = prime_str[n:n+5] return solution print(solution(3)) import time start = time.time() for i in range(10_000): solution(i) print(time.time() - start, 'seconds')
dont know if it's just because my computer is a potato, but my code ran for ~150s on my machine! i even cheated a little by using a lookup table! my code only stores 5 primes at once to ensure the resulting string is always atleast 5 characters long, instead of the whole prime string. here's my code: def gen_prime_id(num): strlen = 0 currprimes = [0, 0, 0, 0, 0] PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97] for n in range(2,1000000): isPrime = True for d in PRIMES: if n >= d * d and n % d == 0: isPrime = False break if isPrime: del currprimes[0] currprimes.append(n)
your code is slow because you check each and every number if it is devisible by all the defined primes. It is also slow because you have a deletion at the begining of the array. If you insist on keeping this algorythm, you must limit the checks for primes up to the square root of the current number and use deque instead of an array. In addition, your answers are not correct beyond num = 4735, because you don't check if the values are devisible by primes larger than 97. You need the primes up to 147.
13 secons on a slow laptop and statless: def getID(n): M=(n+5)*5000//2465+456 sieb=[0]*M for i in range(2,int(M**.5)+1): if not sieb[i]: sieb[2*i:M:i]=[1]*len(range(2*i,M,i)) return "".join(str(i) for i in range(2,M) if not sieb[i])[n:n+5] print(len(primes),sum(primes)) import time t=time.time() for n in range(10000): getID(n) print(t-time.time())
Constraints:
Your code will run inside a Python 2.7.13 sandbox. All tests will be run by calling the solution() function.
Standard libraries are supported except for bz2, crypt, fcntl, mmap, pwd, pyexpat, select, signal, termios, thread, time, unicodedata, zipimport, zlib.
Input/output operations are not allowed.
Your solution must be under 32000 characters in length including new lines and and other non-printing characters.
Aren't you running it in a conda env with python 3.8.15?
Do NOT use java instead as it will execute the code in ~1.2 seconds for 10'000 calls of the function "solution", this would be too bad
Why not just generate a string of primes with length 10,005 using a shell script and hard code it into the solution, this would make it much much faster
@@nomzz1you couldnt use imports, so im assuming the code it-selskab also has to be purely Python
the fact you used camel case 1 line after using snake case made me more mad than it should've
I find videos like this to be extremely rewarding to watch. I really enjoyed the explanation of the ‘Sieve of Eratosthenes’. Really cool stuff!
for prime_str, i think if u use the built-in "filter" function instead of comprehension you would gain some speed... it's roughly 15% faster to use filter instead comprehension (15% for that statement only)
Here's my solution to "patch" on the prime string on at runtime. It's a bit hacky, but ideally a function like this would be used with a generator or a class object to remember state. There are improvements that can be made to make it scale since right now it just uses the hardcoded limit of 10_000
def solution(n):
if "prime_str" not in dir(solution):
max_int = 10_000
number_set = [True] * max_int
number_set[0] = number_set[1] = False
sieve_end = max_int ** 0.5
for number, is_prime in enumerate(number_set):
if number
My jaw literally dropped when I ran this. Amazing work!
@@b001 This is just adding state, which isn't allowed, no?
@@schattenskorpion That's where I think this is a bit hazy. It is completely self contained inside the function and doesn't pollute the global namespace, but it does set a "state" for the function itself. It would really depend on what their meaning of having a stateless function would be
Hi I'm new to python, could you please tell me what exactly you did??
@@NomanKhan-jf6pq Sure, so my implementation follows along almost perfectly with what b00l did, except that it treats the function as an object (since everything in Python is an object under the hood). So during runtime, what happens is the first time this function gets called, the member "solution.prime_str" doesn't exist (that's what the "prime_str" not in dir(solution) does). From here, it follows the same code as provided, and then attaches the prime_str to the solution function. Then, each time the function gets called, it can find the solution.prime_str member and index into that
Its nice to see algorithms at work, in this case the ‘Sieve of Eratosthenes’. Thanks for the video!
The code can be improved if you add in an elif which breaks the loop once number > (maxInt**(1/2)). Since there will be no other composite numbers in the list after this condition, the list will not change from this point on and there is no need to check the rest of it to the if statement condition.
Great observation. I see what you mean, instead of checking each number after (maxInt**(1/2)) against the if-statement, knowing each number would be False, just break out of the loop all together.
Yes, correct, it also helps if you put maxInt**(1/2) into variable before for loop so it doesn't calculate trought itteration
Why do you use number**2 as the start of the range in line 23? Doesn't this lead to missing multiples of that number. For example with 10 the start would be 100 up to maxInt so you would not find the multiples 20, 30, 40, 50, 60, 70, 80 and 90. Pls explain.
Each of those gets thrown out by 2. You can try this with any number, it’s multiples up to number**2, are already checked by previous numbers multiples. I realize I didn’t explain that part very well.
Another way you can think about this is that 5 * 2 is equivalent to 2 * 5, same with 3 * 5, 4 * 5, etc.
The first "new" composition will always be the number times itself, i.e. 5 * 5 or 5**2, so you will always throw out the compositions prior to reaching the squared version of the number.
Guys i am not even close to the level or solving google interview problems.
I was decently following along, but I wouldnt imagine a common person just implement this logic and solution this easy and fast right?
What i mean i need to do step by step, drawing board, trial and error and maybe than i can get somewhere close, and definitely not even in 30 min, please tell me i am not the only one.
These people have years and years of practice and likely know multiple languages. You just need to keep learning, it's all practice and experience.
Oh no, this is several hours of thinking all condensed into 10 short minutes! There was ALOT of drawing board, trial and error done in the background. :)
@@b001thanks for clarifying. I was questioning if I could ever do this in a minute.
Would defining the prime string right before the function, and then using it inside the function be considered an import?
he meant that it cannot save state. “no imports” is a weir way to word that though. so yes, it would.
@@chri-k thanks
It seems to me, you create a function to determine the string, then call that function in a function to determine the string just once initially. That would not be an import and you only create the string once and you can adapt the string length based on the amount needed with little more lines of code.
he meant that it cannot save state
@@chri-k To me that seems like really piss poor coding and something that is really beneath Google. At least create a single function that creates the string once and reuses it to calculate all the needed numbers. While that is sloppy coding combining two functions into one, it would work.
@@mind_of_a_darkhorse this is an interview question. They are trying to get you to make a time- efficient algorithm to generate the string at different values - and realize that you need to make it time-efficient by yourself. There is no other way to do this because it is not a practical or good thing to do in and of itself - as you said - and a more complicated scenario does not make for a good interview question.
or that’s what i think their reasoning is.
@@chri-k My thought is that on one hand, I get what you're saying, but overall the premise is rather a dumb way to achieve the goal of the intended outcome. I guess there is quite a bit more that I have to learn.
The function isn’t really collision free though, is it? There’s no way the input of 1 to 10,000 returns 10,000 (five digit) unique IDs, given that, in the concatenation, no prime begins or ends with leading zeros, the only way of arriving at something like 00001 is if it were nested in 100,001. The video shows, for the first 10,000 inputs (on the x axis) the search space maxes around primes under 20,000 (y axis), loosely demonstrating the function is not bijective. The commander should known this is not foolproof.
You don't need 4 leading zeros to have 10,000 unique ids though. It's possible to have 10,000 unique ids with 5 random consecutive digits of primes, because there's 100,000 combinations with 5 digits. However, within the first 10.000 digits, there's repeating substrings of up to 9 digits (113127131). So it's possible, but unlikely to get 10,000 unique 5 digit ids.
I think the index is based on the number of primes, not the number of characters, and if I'm not mistaken then you don't need to generate a string from all the numbers, just enough to acquire 5 characters. So I would add 5 to the sieve size and then take the number at the index plus the next 5 and stringify them, then get 5 characters.
using memoization to remember the already computed prime numbers and generate more prime numbers if needed, it dosent use any import or exports, its just saved in memory along with the function
"no imports and exports" was just a badly phrased way to say "the function isn't allowed to save any state between iterations. He explicitly mentioned this strategy and that it's not allowed in the video.
this task was flawed from the beginning. I thought "that can't create unique IDs!", then i tested it. There are 1131 non-unique IDs, e.g. 210 and 448 both have the id 94194
Yeah, I had that same thought, I never actually verified if all the IDs were unique.
@@b001 Yeah, it actually starts with 4... ID 4 collides with 655, 1245, 6572..
These are the most egregious offenders:
id=11131 4 collisions indices=[4, 655, 1245, 6572]
id=17331 4 collisions indices=[167, 2747, 6700, 7017]
id=33133 4 collisions indices=[169, 1313, 7003, 7044]
got invited to do their challenge a while ago, made it to stage 4 but failed one sadly ;(. now i know i had to use absorbing markov chains for ditrubited matrices but thats too advanced for me lol. goodluck to those who get it
Yeah the markov chain question was a bit crazy... I recognized it might involve markov chains, then managed to translate the maths from wikipedia into code. I guess that's what you're supposed to do??
I have Google Foobar unlocked but I'm extremely underqualified to take it and succeed. You can thankfully leave it for as long as you like before taking it. No clue when I'll eventually take it
I'm a beginner in programming, and found your video very useful. but I can't get the isPrime. it's a special function or something like a statement?
timing a function depends on the hardware it is run on, a fast computer will calculate this faster than a slow pc
I made it 4 times slower lol:
def isprime(n):
if n in {2, 3}:
return True
if n % 2 == 0 or n < 2:
return False
return all(n % i != 0 for i in range(3, int(n ** 0.5) + 1, 2))
def solution(n):
str_len = n + 5
integers = list(range(n, int(str_len * 5000 / 2465 + 476)))
prime_str = ''.join([str(x) for x in integers if isprime(x)])
return prime_str[n-1:str_len-1]
b001: 22.4 seconds
Greg: 83.9 seconds
Hey, first off, I've been totally addicted to your videos lately. Compared to so many other TH-camrs "tips" your's are actually helpful and fun to watch.
I know this a random question but may I ask what keyboard you are using or differently what switches it uses because I really like the sound of it :)
Again, great job on your videos
I got a new keyboard for Christmas which is the Keychron V2 Wired Custom Mechanical Keyboard Knob Version with Keychron K Pro Red Switches. My previous keyboard was the Skyloong SK61 with gateron red switches
def solution(x):
base = "2357111317192329..
return base[x,x+5]
1 to 1000 is not such a long string, just calculate it as pre work to the function
later insert that long string as is into the function
when the function is called just return by string index
nowhere in this question was it said pre work cant be done
this solution is O(1)
Isn't time based on computing power?
Great video! I really want to know what that theme is though, it looks really nice
SynthWave'84
I can think of two ways of making this run faster by only needing to compute prime_str once. The easiest way would be to save it as an attribute to the function solution(): you just need hasattr(), setattr() and getattr(). Second would be to have a function to compute prime_str, say compute_prime_str(), which computes the value once, then replaces itself in the global namespace with a new function that just returns that computed value. You can do this with globals()["compute_prime_str"] = lambda : prime_str.
A mutable default arg might work
could u pls explain me how did u come out with that formula of the graphic?
IDs are supposed to be unique.will this garantee that? Was that not the requirements.
Does anyone noticed it code yield incorrect result for all integers after 4715
People, please follow PEP8 guidelines. Variable names are supposed to all_be_in_snake_case and not camelCase or PascalCase.
Hey, what is your vs code theme
I got the foobar thing about 8 years ago! I was in my first year of college, near midterms though and only had time for 2 questions lol
How can you get access to this google foo bar?
Is there a reason why I can't generate the string of primes up to 10005 digits, paste that into a string inside the solution function, and then grab the needed digits via indexing? I feel like this would be the fastest method, but maybe I'm missing something...
"> Your solution must be under 32,000 characters in length including new lines and and other non-printing characters"
Welp, 10,005 digits fit with plenty of room to spare. This will be by far the best solution that follows the letter of the requirements.
I wonder if people who provide that as an answer would be rewarded or punished.
One basic ideas (not a Python expert, so the compiler might do this already):
Only loop over even indices once, then in the main loop iterate in steps of 2. Or better yet, start by constructing an alternating array of True, False, True...
( somewhat like number_set = [False, True] * (n//2) + [True] )
This isn't an asymptotic improvement anyways, so pretty irrelevant.
An asymptotic improvement could be achieved by using a better method of finding primes, like The Sieve of Atkin.
Edit:
Another idea/remark: Id prefer iteratively growing the prime sieve, counting the number of primes with d digits and adding that up. I think your heuristic formula approach is very risky, a good interviewer might ask you to proof that formula. How would you explain its correctness?
A heuristic implies knowledge about the distribution of prime numbers, which is one of the greatest unsolved problems in mathematics. If you can't prove your solution to be correct, don't you have a problem?
You are essentially giving an upper bound for this problem (several exist afaik), you'd have to explain how you got that.
Since n is also given an upper bound, this is pedantic and in this simple case the solution is probably fine.
Why not just hard code the prime string? Would look horrible but would be fast.
This seems like the wise solution
Can I have the name of the theme and font?
You can store the list of digits as an attribute of the function like solution._digits = …
Am I stupid or can he just create a global variable that holds the string, compute it once, then skip computing every time and just re-use the old string?
Theoretically you could also append to the string as needed if a value is input that is higher than the string already calculated. That way you don't have to calculate up to a length 10000 string if a high enough number isn't pulled. Again, a global to hold maxInt would allow you to check that condition and continue calculation.
He mentioned the function isn't allowed to keep state between runs.
@@1vader ah thanks. I missed that part.
would you ever make tutorials?
Does anyone know what VS code theme that is?
Looks like an Atom variant
@@juxyper Thanks!
It’s the SynthWave '84 theme
@@lama-yt Thank you so much!
What @lama-yt said
Honestly I'd have done something similar, but I would have ran separate code to generate the string for n = 10000 and just copy and pasted it into a variable lol. Its hacky but would absolutely cut compute time
great work!
here is my optimization over your solution that runs ~30% faster on my machine, compared to your algorythm, although I have to admit that the key point is finding the formula for the maximum prime required. Your formula needs a bit of an optimization on the lower end, but still, it is prety good one.
def solution4(n):
str_len = n + 5
maxInt = int(str_len * (5000/2465) + 475)
numberSet = [True] * maxInt
numberSet[0] = numberSet[1] = False
sqrt_maxInt = maxInt ** (1/2)
number = 2
while number number and _)
prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y)
return prime_str[n:str_len]
Nice job! Yep, I ran your solution and got 18.5 seconds on my machine. Well done!
I shaved off another ~5% by using a for loop instead of a while loop, which looked like:
for number in range(2, sqrt_maxInt):
if numberSet[number]:
for multiple in range(number**2, maxInt, number):
numberSet[multiple] = False
@@b001 if you want to be a bit more precise with the formula:
maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34)
your formula gives a mean value of the deviation from the required maxInt of 380
this one gives 221 overall and for the second portion (n>5000) it even gives 177
in addition to the bellow suggestion for the for loop, the end result is:
def solution7(n):
str_len = n + 5
maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34)
numberSet = [True] * maxInt
numberSet[0] = numberSet[1] = False
sqrt_maxInt = int(maxInt ** (1/2)) + 1
for number in range(2, sqrt_maxInt):
if numberSet[number]:
for multiple in range(number**2, maxInt, number):
numberSet[multiple] = False
prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y)
return prime_str[n:str_len]
and runtime drop from 25 seconds (original solution) to 18 seconds(solution4 from previous reply) to 17 seconds (solution7 current reply), on my current machine.
If you wonder where this formula comes from, I used the primes needed for n in (0, 4735, 10000) to draw a circle with center at (271533,-121780) and radius 297595.
Then I used the y axis of the arc between x=(0,10000) for the maxInt which I padded with 34 (to compensate the inaccuracies of my calculations). Perhaps a better arc can be used.
a few more miliseconds shoved off with a suggestion from @schattenskorpion
def solution12(n):
str_len = n + 5
maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34)
numberSet = ([False, True] * (maxInt//2) ) + [True]
numberSet[0] = numberSet[1] = False
numberSet[2] = True
sqrt_maxInt = int(maxInt ** (1/2))
for i in range(4, maxInt, 2):
numberSet[i] = False
for number in range(3, sqrt_maxInt, 2):
if numberSet[number]:
for multiple in range(number**2, maxInt, number):
numberSet[multiple] = False
prime_str = ''.join(str(x) for x,y in enumerate(numberSet) if y)
return prime_str[n:str_len]
@@b001 and using chatGPT's suggestion for optimized prime algorithm, I managed to compile this:
def solution_chat(n):
maxInt = int((297595**2 - (271533 - n)**2)**(1/2) - 121780 + 34)
sqrt_maxInt = int(maxInt ** (1/2)) + 1
size = maxInt // 2 + 1
primes = [2]
numberSet = [True] * size
numberSet[0] = False
for number in range(3, sqrt_maxInt, 2):
if numberSet[number//2]:
for multiple in range(number**2, maxInt, number*2):
numberSet[multiple//2] = False
primes.extend(x*2+1 for x,y in enumerate(numberSet) if y)
return ''.join(map(str, primes))[n:n+5]
the explanation from chatGPT was:
"This version of the function uses the Sieve of Eratosthenes algorithm with some additional optimizations. It first eliminates even numbers, then checks the odd numbers in the range using an optimized version of the Sieve of Eratosthenes using Sundaram sieve and segmented sieve to only check the numbers in the range. This should result in improved performance."
But the suggested function had some nasty bugs, whuch I had to deal with and selectively merge it with my code.
This last version gives another ~30% improvement compared to my last suggestion. So, overall improvemnt from the video version - a bit more than 50%
Whats the purpose of these contests now that we have AI?
I was sent the invite link to foobar a couple years ago (with my google screen splitting into a message saying "you're speaking our language"), procrastinated doing this first challenge, and my invitation was rescinded. Everytime I remember, I want to punch myself in the face.
Im a begginer in python, a kind of a hobbist to be honest. I paused the video at the beggining and tryied to solve for myself. Here is my solution:
def is_prime(n):
if n
I took a stab at this. Without saving any sort of state, the best I could come up with was ~30% improvement:
edit: Didn't see this should be for Python 2.7.13. I used Python 3.10
def solution(n):
string_length = n + 5
max_number = int(string_length * 2.028 + 475)
number_set = dict.fromkeys(range(3, max_number, 2), True)
exp = max_number**0.5
prime_string = ""
for number in number_set.keys():
if number > exp:
break
elif number_set[number] is True:
prime_string += str(number)
for i in range(number**2, max_number, number):
if i in number_set:
number_set[i] = False
return prime_string[n - 1 : string_length - 1]
I think 'if number * number > max_number' is faster than your current check, if I'd have to guess. I think in general, number * number is faster than number**2.
This solution took 0.006s to run.
It uses an cached sieve that is regenerated when it become too small
def gen_prime_str(max_int: int):
primes = [True] * max_int
primes[0] = primes[1] = False
for i, state in enumerate(primes):
if state and i str:
nonlocal prime_str, current_max
while n > len(prime_str):
current_max *= 2
prime_str = gen_prime_str(current_max * 2)
return func(prime_str, n)
return wrapped
@gen_prime_string
def solution(prime_str, n: int) -> str:
return prime_str[n:n + 5]
cool. your solution on my old rig takes 31 seconds. your solution on my new rig takes 14 seconds.
my solution on either runs in .. error.
This seems overcomplicated ngl there are much easier ways to do
I did it on my M1 device, it took me little under 10 seconds. Here is the output: 9.61848497390747 seconds
that’s gordon freeman
I keyed your script so I could compare. Yours took 9.617628 seconds and mine took 0.037388 seconds. Just FYI. Hope people find my solution useful or at least educational.
I wish I were smarter... I'm dumb but still drawn to coding, math but my brain can't comprehend this info or perhaps I'm missing something...
Lol this is way too advance for me currently 😜
i realized sqrt from math import is same as **(1/2) 😅. its a great video. but i arrived at result using map filter and lambda function..
Could you not have the Sieve of Eratosthenes function just run once, save it as a variable and then run your 10_000 for loop inside the solution? I tried it and got a time of 0.002 seconds however I don't know if putting the test inside of the solution is cheating. Here's my code:
def solution():
def calculate_prime_str(max_ID):
maxInt = int((max_ID+5) * (5_000/2_465) + 475)
numberSet = [True] * maxInt
numberSet[0] = numberSet[1] = False
for number, isPrime in enumerate(numberSet):
if isPrime and number
I thought and did the same thing, but part of the problem is that it has to be "stateless" which means that you can't store previous states. Every time the solution function is run, it can't use any outside or previous information
@@adicsbtw ah fairs if that’s the case then @d001’s is the best solution imo
I'm not sure if this works with older versions of Python, but I think this is faster (it was faster than the implementation in the video when I tested it on my shitty laptop, but my shitty laptop was slower than the time it took in the video).
from time import perf_counter as pf
def solution(n):
# 1
I think you ONLY need to create ONE prime_str for the highest n ( in this case n= 10000 ) then each number in range(n) extracts the [n : n+5] digits from this prime_str
here is the modified code of yours :
def solution(n):
str_len=n+5
maxInt=int(str_len*(5000/2465)+475)
numberSet =[True]*maxInt
numberSet[0]=numberSet[1]=False
for number,isPrime in enumerate(numberSet):
if isPrime and number
My solution got 0.002 seconds i.e. 2 milliseconds.
The most obvious and the fastest solution is to pre-generate the string for the 10_000 case then get all the id s from there.
def solution(n):
prime_str = '235711131719232931374143475359616771737983899710110310710911312713113713914915115716316717317918119119319719921122322722923323924125125726326927127728128329330731131331733133734734935335936737337938338939740140941942143143343944344945746146346747948749149950350952152354154755756356957157758759359960160761361761963164164364765365966167367768369170170971972773373974375175776176977378779780981182182382782983985385785986387788188388790791191992993794194795396797197798399199710091013101910211031103310391049105110611063106910871091109310971103110911171123112911511153116311711181118711931201121312171223122912311237124912591277127912831289129112971301130313071319132113271361136713731381139914091423142714291433143914471451145314591471148114831487148914931499151115231531154315491553155915671571157915831597160116071609161316191621162716371657166316671669169316971699170917211723173317411747175317591777178317871789180118111823183118471861186718711873187718791889190119071913193119331949195119731979198719931997199920032011201720272029203920532063206920812083208720892099211121132129213121372141214321532161217922032207221322212237223922432251226722692273228122872293229723092311233323392341234723512357237123772381238323892393239924112417242324372441244724592467247324772503252125312539254325492551255725792591259326092617262126332647265726592663267126772683268726892693269927072711271327192729273127412749275327672777278927912797280128032819283328372843285128572861287928872897290329092917292729392953295729632969297129993001301130193023303730413049306130673079308330893109311931213137316331673169318131873191320332093217322132293251325332573259327132993301330733133319332333293331334333473359336133713373338933913407341334333449345734613463346734693491349935113517352735293533353935413547355735593571358135833593360736133617362336313637364336593671367336773691369737013709371937273733373937613767376937793793379738033821382338333847385138533863387738813889390739113917391939233929393139433947396739894001400340074013401940214027404940514057407340794091409340994111412741294133413941534157415941774201421142174219422942314241424342534259426142714273428342894297432743374339434943574363437343914397440944214423444144474451445744634481448344934507451345174519452345474549456145674583459145974603462146374639464346494651465746634673467946914703472147234729473347514759478347874789479347994801481348174831486148714877488949034909491949314933493749434951495749674969497349874993499950035009501150215023503950515059507750815087509951015107511351195147515351675171517951895197520952275231523352375261527352795281529753035309532353335347535153815387539353995407541354175419543154375441544354495471547754795483550155035507551955215527553155575563556955735581559156235639564156475651565356575659566956835689569357015711571757375741574357495779578357915801580758135821582758395843584958515857586158675869587958815897590359235927593959535981598760076011602960376043604760536067607360796089609161016113612161316133614361516163617361976199620362116217622162296247625762636269627162776287629963016311631763236329633763436353635963616367637363796389639764216427644964516469647364816491652165296547655165536563656965716577658165996607661966376653665966616673667966896691670167036709671967336737676167636779678167916793680368236827682968336841685768636869687168836899690769116917694769496959696169676971697769836991699770017013701970277039704370577069707971037109712171277129715171597177718771937207721172137219722972377243724772537283729773077309732173317333734973517369739374117417743374517457745974777481748774897499750775177523752975377541754775497559756175737577758375897591760376077621763976437649766976737681768776917699770377177723772777417753775777597789779378177823782978417853786778737877787978837901790779197927793379377949795179637993800980118017803980538059806980818087808980938101811181178123814781618167817181798191820982198221823182338237824382638269827382878291829382978311831783298353836383698377838783898419842384298431844384478461846785018513852185278537853985438563857385818597859986098623862786298641864786638669867786818689869386998707871387198731873787418747875387618779878388038807881988218831883788398849886188638867888788938923892989338941895189638969897189999001900790119013902990419043904990599067909191039109912791339137915191579161917391819187919992039209922192279239924192579277928192839293931193199323933793419343934993719377939193979403941394199421943194339437943994619463946794739479949194979511952195339539954795519587960196139619962396299631964396499661967796799689969797199721973397399743974997679769978197879791980398119817982998339839985198579859987198839887990199079923992999319941994999679973'
solution = prime_str[n:n+5]
return solution
print(solution(3))
import time
start = time.time()
for i in range(10_000):
solution(i)
print(time.time() - start, 'seconds')
dont know if it's just because my computer is a potato, but my code ran for ~150s on my machine! i even cheated a little by using a lookup table!
my code only stores 5 primes at once to ensure the resulting string is always atleast 5 characters long, instead of the whole prime string.
here's my code:
def gen_prime_id(num):
strlen = 0
currprimes = [0, 0, 0, 0, 0]
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
for n in range(2,1000000):
isPrime = True
for d in PRIMES:
if n >= d * d and n % d == 0:
isPrime = False
break
if isPrime:
del currprimes[0]
currprimes.append(n)
strlen += len(str(n))
if strlen > (num + 5):
pstr = ''.join(map(str, currprimes))
start = num - strlen + len(pstr)
return pstr[start:start+5]
your code is slow because you check each and every number if it is devisible by all the defined primes. It is also slow because you have a deletion at the begining of the array. If you insist on keeping this algorythm, you must limit the checks for primes up to the square root of the current number and use deque instead of an array. In addition, your answers are not correct beyond num = 4735, because you don't check if the values are devisible by primes larger than 97. You need the primes up to 147.
13 secons on a slow laptop and statless:
def getID(n):
M=(n+5)*5000//2465+456
sieb=[0]*M
for i in range(2,int(M**.5)+1):
if not sieb[i]:
sieb[2*i:M:i]=[1]*len(range(2*i,M,i))
return "".join(str(i) for i in range(2,M) if not sieb[i])[n:n+5]
print(len(primes),sum(primes))
import time
t=time.time()
for n in range(10000):
getID(n)
print(t-time.time())
Agree with this. You can save a little time by just checking odd values>3 (see my solution somewhere in this thread).