I discovered this video a while back and returned to it today because I'm diving into Procedural World Generation for a game I'm developing. I really wanted to create something unique, rather than just follow the most popular tutorials out there. Your explanation was fantastic and inspired me to think deeply about the possibilities when you truly invest effort into your work. I haven't come across much content on Constraint Satisfaction in the context of Procedural Generation (perhaps because it can be a bit slow), but your video really opened my eyes. Thank you for sharing such valuable insights!
I’m thrilled to hear it was so useful. Constraint satisfaction has been around for a long time, but I haven't seen it applied to procedural generation much either. I'm not sure why; maybe it's just viewed as too old to be applicable (?)
@@programmingchaos8957 Probably. It also takes some time to generate, and most procedural generation these days just uses noise and calls it a day. There has been a small rise in the popularity of wave function collapse for procedural generation, so who knows, maybe this will also catch on.
I really love how you show every step of the way, including mistakes and fixing bugs. You explain everything in a way that I, a novice programmer who finds coding very daunting at times, can both understand and be comforted by, seeing that the code is not as complicated as it may seem at the beginning, or as I imagine it would be from looking at the result it gives.
Thank you! Part of the reason for showing (some of) the bugs is because that's how real programs are written. It's a bit demoralizing (at least for me) to get the impression that there are people out there who can write long sections of code without a single mistake - not that I think they really exist, but it seems like it watching some tutorials. Part of the reason is Bob Ross' 'happy little accidents' - occasionally a mistake leads to a more interesting result - at least with procedural generation. And finally, I do make mistakes; if I rerecorded every time I made one I'd never get anything published :) I am really glad that this is helpful. There are a lot of programming concepts, maybe all of them, that seem impossibly complicated until they 'click' and then you have trouble remembering why they seemed complicated in the first place. As a wise person from my kids' generation said, "sucking at something is the first step to being sorta good at something".
Holy cow it’s my old professor, always loved your work, glad TH-cam recommended this to me (Kyle Morgan, you wouldn’t remember me lol but I was a decent student, software engineer now)
Hi! It's great to hear the degree is serving you well. I hope you're enjoying the work. As you can probably tell I'm still enjoying teaching :) And thought I should help a larger audience.
@@programmingchaos8957 You’re a great teacher and professor. Your research has always had my attention. I’m treating SE as a career, which has served me well, but I miss the science side of CS. I have a library of rigorous math, CS, and physics books and I’m trying to figure out how to be valuable in those fields. Your work is endlessly inspiring and accessible to so many. Thanks for responding, I’ll be watching all your videos on this channel this weekend. Thanks again for not only being a great teacher but also sharing your fascinating research and creative projects with students at every level, cheers!
You're very welcome! I have to say that one of the huge advantages of being a professor (besides getting to teach) is the flexibility to try the random projects that seem interesting.
The ability to add initial values like you did with the island seems like the coolest feature here. I could imagine drawing in a rough landscape idea with an image editor and then handing those initial conditions to the constraint satisfier to turn it into an actual map
Thanks! I agree that that makes it much more interesting. I think it's sometimes referred to as a falloff map, if you want to learn more about how it can be done.
This could be used to generate test cases for obscure scenarios. One time I used constraint programming to generate all possible cases (a few hundred different cases at most) with a bunch of weird rules given by business analysts. It proved that they didn't even know what they were asking for. It simplified the rules in the end, because they were cancelling each other out in a few cases.
Very cool application! It is surprising how many problems can be reformulated as constraint satisfaction, and then there are well-defined solution approaches.
@@programmingchaos8957 Indeed. I believe these methods sound too formal to be used out there; that's why we don't see more of them. In more critical projects or fields, I believe this method is treated with more importance. I wish I could work with it.
Didn't absorb it all as I have many, many things I still need to learn, but the way you explain makes it so much easier and fascinating; you made me feel like I was consuming something I should have paid for. Great video! Also, just a rather silly thing: you might want to flip your webcam next time, so when you gesticulate to the right/left, it'll be the right/left meant to be shown.
Thanks! I'm really glad you found it useful. Good point about flipping the camera. I should have noticed that during editing. Thanks for the suggestion.
Very nice video! I do like this concept a lot as I've been interested in procedural pattern generation, but it seems cumbersome to have to set all the values myself. WFC does not require this, but it is not as flexible as the approach you showed. This has given me some ideas on how to improve some of my own projects. I hope to see more from you!
Thank you! I think this approach is quite a bit simpler than WFC. The main place WFC will work better is if there are relatively few solutions available; then its more systematic approach is more likely to find them. I plan to keep producing, if only because I enjoy it. I do have a fair number of other videos, many on procedural generation, but covering more specific techniques.
Awesome and inspiring, thank you for posting that. I appreciate that you left in your errors but added "Yes, I know that's incorrect" call-outs so that those of us yelling "You made a mistake!" can relax. I've built a web-based 2D animation framework for browsers with Erlang, JS and websockets, and this gives me an idea of what to use it for.
Thank you! I leave the errors in (and point them out) for a couple of reasons. I think it's a better representation of 'real' programming, where we make mistakes, typos, etc. I'm afraid that showing perfect coding is a little demoralizing, especially for newer programmers. And this way I don't have to rerecord over and over until it's perfect :) I'm glad you appreciate the pointers to the errors. If you add this to your framework, feel free to post another comment, I'd love to see it running.
Thanks! There's a reason for that :) Because this is my channel and not a University one, I'm not sure if I should mention the name, but the banner in the background makes it easy to figure out if you're interested.
Sounds great! I struggled a bit with the 3D version because there are nine cells 'under' each cell, so the constraints had to be changed a bit to keep it from just filling with green.
29:18 That's very nice! It's doing an exceptionally good job at making the circle not too perfect looking. Though, I can imagine that it would be obnoxious if the deep water was a kill zone because then the player wouldn't have a precise measurement of how far off from land they can go before getting to deep water.
The deep water issue is an interesting one. How much the shallow water (or any other terrain) spreads depends a lot on how many tries are made to find a terrain that meets all constraints. Few tries and the terrain 'wanders' more. In the deep water case it would probably make sense to do a post pass that smooths out that boundary so it's obvious.
This is a neat technique. Interesting that the search window functions to limit the spatial variance frequency. Although I still think it's generally much better to do multiple passes at lower resolution to bias it, or to allow for more complex output from simple rules (like height, temp, and moisture as simple low-res runs in that order, each depending on the previous, possibly with simpler rules or techniques, then upscaled, and then used to bias something to pick the final state; simple rules for each but much more complex output).
Thanks for the feedback. I tend to agree that there are a lot of things that could be done to make the terrain more interesting/realistic. I particularly like the idea of applying the algorithm at multiple scales. The purpose of this video of course was just to be a relatively quick introduction to the approach, and a brief discussion of other application areas.
Rather than randomly generating terrain types in leastConflicts, you can keep a set/ArrayList of the types with the least conflicts, then choose randomly from that list. This avoids the "fill everything with mountains" problem and also does not skip types or generate them twice.
Definitely! I was thinking my approach was a bit easier to code, and was trying to keep things simple. But removing items in a random order from an ArrayList would be pretty simple. Languages with a built-in shuffle function would also work well of course.
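To illustrate, here's a minimal Java sketch of that tie-breaking idea (this isn't the video's code; numConflicts() and NUM_TYPES stand in for whatever your sketch uses):
```
import java.util.ArrayList;
import java.util.Collections;

// Sketch: pick uniformly among the terrain types tied for fewest conflicts.
// numConflicts(x, y, type) is assumed to count the constraint violations
// placing 'type' at (x, y) would create; NUM_TYPES is the terrain count.
int leastConflictsType(int x, int y) {
  int fewest = Integer.MAX_VALUE;
  ArrayList<Integer> bestTypes = new ArrayList<Integer>();
  for (int type = 0; type < NUM_TYPES; type++) {
    int c = numConflicts(x, y, type);
    if (c < fewest) {          // strictly better: restart the tie list
      fewest = c;
      bestTypes.clear();
      bestTypes.add(type);
    } else if (c == fewest) {  // tied for best: remember as a candidate
      bestTypes.add(type);
    }
  }
  Collections.shuffle(bestTypes);  // random pick among the tied best
  return bestTypes.get(0);
}
```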
This is an awesome video! I love the idea of it generating in real time, I could imagine implementing painting-systems where you can live edit the world and paint in your own details.
Thank you! And that's a great idea! Painting in the rough map to start with, 'I want a mountain range here, a deep lake there, a massive forest in the North' and then letting the algorithm fill in the details, and then being able to go in and re-edit, clean up shorelines, merge mountain ranges, add mountain passes, etc. would be fantastic.
How my uneducated, 32-year-old, self-taught, not-quite-'programmer' self thought of this before watching your video: 'bucket algorithms' and 'water algorithms'. Haha, ignorance is truly bliss, but thank you for the video! I'm trying to figure out how to do this with MPI in the bash terminal.
Right! But this is the drawback to local search: because it's fairly random there's no guarantee that it will find a solution even if one exists (technically it's an 'incomplete' search). So, the best you can do for termination is to stop after a lot of tries. The assumption is that if you are doing procedural generation there should be a lot of possible solutions, and so the algorithm will find one. If there's only one (or possibly no) solution it's better to use a complete search algorithm, but they can be harder to code and a lot slower to run.
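In code the termination condition is just an iteration budget; a rough sketch (leastConflictsPass() is a hypothetical helper that repairs one batch of cells and reports whether any conflicts remain):
```
// Sketch: incomplete local search with a fixed budget of passes.
// Returns true if a conflict-free map was found before the budget ran out.
boolean solve(int maxTries) {
  for (int tries = 0; tries < maxTries; tries++) {
    if (!leastConflictsPass()) return true;  // no conflicts left: success
  }
  return false;  // budget exhausted; no guarantee a solution even exists
}
```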
Thanks! I find it can be a bit demoralizing when tutorials make it seem like there are people who can program huge stretches of code completely error free. So, I don't mind showing some of my errors. Of course, it also means I don't have to rerecord those sections, which saves a lot of time :)
@@programmingchaos8957 It's not just demoralizing, but it also conveys a completely wrong picture of what programming actually is: an iterative process to find a solution for a given problem. Coding errors are an integral part of that. The few times I arrive at a solution without any errors leave me utterly confused :D When I teach programming to newbies, I like to show them test driven development. It seems to help them understand the iterative process better and to not get frustrated by failures.
Putting it in the framework of test driven development is a great idea! I teach some test driven development, but for newbies I fear I have a bad habit of planning the algorithm on the board (which is usually simple because I've taught it half a dozen times before) and then just implementing it. At least when I'm teaching I do show the accidental typos and minor errors along the way. But don't do as good a job with the larger, logical errors. Thanks!
Thanks for another fun and interesting video! I find the wave function collapse stuff to be difficult to follow. But your code seems to make neat things and is much easier to digest.
Thanks! I originally planned to do a video on Wave Function Collapse, but the more time I spent on it the more I felt that for procedural generation minimum conflicts was a lot simpler to code and would do just as well. As far as I can tell Wave Function Collapse is the same as forward checking for constraint satisfaction, so if you're struggling to understand wave function collapse you could look at descriptions of forward checking for constraint satisfaction.
You're calling leastConflicts every frame. That would make it slower (checking every frame). You can fix it by not checking again once it returns success (or after a count of a few times). But it's also cool, because you can interactively modify the landscape and let it adapt (e.g. sandbox destruction). It's also cool that in your checkConflict you could use programmatic rules. You've used a map, which is kinda like WFC, but you could have used rules like 'only this many mountain pixels', or 'this much water'. Or like you did with the plants and berries. Amazing!!
Again you're absolutely right! Instead of doing N checks in leastConflicts and then trying again, leastConflicts could loop until there are no more conflicts before returning. But I liked being able to see how the terrain was changing. And it's good for debugging if something goes wrong. I do like the idea of an extra step in the draw loop where after leastConflicts is called once, it pauses and the user can go in and modify the current map and then rerun leastConflicts. The map at the very end does something a bit like what you are suggesting with the cities by slightly limiting the number of them. But in general the terrain rules I'm using are fairly simple, mostly to keep the video short and to give everyone something to build on. If you do try out more complicated rules I'd love to hear what the results are.
20:35 A technique I use often in video games is shuffling an array of indexes and walking those. It guarantees you visit all the cells in random order, only once. Or you can shuffle in place with random and swap (kinda like bubble sort).
Absolutely! I didn't use shuffle because not all languages have it and I didn't want to extend the video by coding it by hand. But it's the perfect solution here.
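For anyone who wants to hand-code it, a standard Fisher-Yates shuffle is only a few lines in Java (a sketch, not code from the video):
```
import java.util.Random;

// Sketch: Fisher-Yates shuffle of the cell indices 0..n-1, so one pass
// can visit every cell exactly once in a random order.
int[] shuffledIndices(int n, Random rng) {
  int[] idx = new int[n];
  for (int i = 0; i < n; i++) idx[i] = i;
  for (int i = n - 1; i > 0; i--) {
    int j = rng.nextInt(i + 1);  // pick from the not-yet-fixed prefix
    int tmp = idx[i]; idx[i] = idx[j]; idx[j] = tmp;
  }
  return idx;
}
```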
This is a great showcase of how the process of procedural generation can go, and it's very well explained, thank you :) I do want to note, around 22:30:

(dx + x + worldWidth) % worldWidth
= (dx % worldWidth + x % worldWidth + worldWidth % worldWidth) % worldWidth
= (dx % worldWidth + x % worldWidth + 0) % worldWidth
= (dx + x) % worldWidth

So it can be shortened :) Unless p5.js has a modulo operator that works differently from normal math, that is.
Thank you! Making sure the explanations are clear is one of my main goals, so I'm glad that's working. Keep in mind that dx can be negative (it starts as negative of the range), if x is small (e.g. less than range) then dx + x is negative and, at least in Processing/Java % of a negative is that negative value. E.g. -2 % worldWidth = -2, which will cause an array out of bounds error. So, the +worldWidth is to wrap the negative cases.
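A tiny example of the difference (Java/Processing semantics):
```
int worldWidth = 100;
int x = 1, dx = -3;  // near the left edge, looking left

int bad  = (x + dx) % worldWidth;               // -2 in Java: an invalid index
int good = (x + dx + worldWidth) % worldWidth;  // 98: wraps around correctly
```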
@@programmingchaos8957 Ah, I understand, thank you. The actual modulo operator in maths is defined differently then, and I'm honestly not sure why it's done this way in JavaScript...
@@zitronenwasser Me neither. It's why I've gotten in the habit of testing the behavior of functions to make sure they act the way I expect. It's rare, but there's nothing worse than spending hours trying to fix a bug that was caused by unexpected behavior rather than a mistake.
Interesting. Is it possible for conflict resolution to create an infinite loop? Like, for example, you fix one set of conflicts and after a few iterations you get back to the initial set of conflicts (like those cellular automata that regenerate themselves).
I'm not sure. I think it's very unlikely for this algorithm to get stuck in a literal loop because it's making random selections to resolve conflicts. So, to get stuck in a loop it would have to be a condition where there are only a few options that it cycles through. On the other hand it's very possible for it to search endlessly through different terrain assignments without ever finding a conflict-free solution - it's just unlikely to loop. There are other systematic conflict resolution algorithms that never get stuck and can say whether or not a solution exists, but they are much slower and, because they are systematic, don't lead to interesting terrains. They are useful for problems like sudoku, where there are very few solutions, or if you want to know if any solution exists.
Sounds similar to wave function collapse, but the process seems a bit different. Just heard you mentioning WFC; I wish you explained the difference a bit more. Nice vid. Unreal has a WFC plugin. But this video also sounds a bit like PCG.
You're right that this is similar to Wave Function Collapse (WFC), but WFC is a bit different. WFC starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned. For example, if you place forest, every cell around it that doesn't already have a value is limited to mountains, forest, or plains). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms. The big difference is that with WFC, if you get to a variable that has no allowed values (e.g. because water was assigned one away from mountains and there's nothing valid to put in the middle) the algorithm backtracks, undoing the last few value assignments and trying again, or gives up and starts over. Whereas with minimum conflicts we accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer. Honestly, as far as I can tell WFC is identical to a depth first constraint satisfaction search using forward checking. If you do use WFC there are some heuristics that can make it much more efficient:
1. Assign values to the variables with the fewest remaining values first. E.g. if cell 1 can be any terrain, and cell 2 can only be mountains or forest, then assign a value to cell 2 first - you're much more likely to reach a solution or a dead end faster and save search time.
2. Assign values to the variables involved in the most constraints first. E.g. a cell with lots of neighbors, rather than only a few.
3. Assign values that give the most flexibility to 'neighboring' variables. In the terrain example it means assigning a 'middle' terrain like plains or forest before an edge terrain like high mountains or deep water. This will lead to solutions faster, but maybe less interesting terrain.
Question for ya when you have a chance, what’s the difference between this and the wave function collapse? I’ve seen that method used for sudoku as well
This minimum conflicts algorithm is willing to assign values that violate constraints with the hope of fixing it on a later pass. My understanding is that WFC won't assign a value that creates conflicts. If it reaches a variable that has no valid assignments it backtracks and undoes recent variable assignments and tries different values for those variables hoping to avoid the conflict (or it starts over). This makes WFC much better at solving problems with very few solutions (like sudoku) and it potentially can report that a problem is unsolvable (there is no valid set of assignments). However, the backtracking is trickier to code and it's potentially much slower. So, if you know there's a lot of solutions available, which is typical for procedural generation, this minimum conflicts algorithm is simpler and faster. But if there are relatively few solutions available WFC or other exhaustive search algorithms may be necessary to find one of the solutions. As far as I can tell WFC is very similar (identical?) to the forward checking algorithm for constraint satisfaction.
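If it helps, the heart of minimum conflicts fits in a few lines; a sketch (not the exact video code; NUM_TYPES and numConflicts() are assumed helpers):
```
import java.util.Random;

// Sketch of one minimum conflicts repair step: pick a random cell, try a
// few random terrains, and keep whichever produced the fewest violations,
// even if that count is still nonzero. WFC would refuse such an assignment
// and backtrack instead.
void repairStep(int[][] map, int tries, Random rng) {
  int x = rng.nextInt(map.length);
  int y = rng.nextInt(map[0].length);
  int bestType = map[x][y];
  int bestConflicts = Integer.MAX_VALUE;
  for (int t = 0; t < tries; t++) {
    int candidate = rng.nextInt(NUM_TYPES);      // random terrain to try
    int c = numConflicts(map, x, y, candidate);  // violations it would cause
    if (c < bestConflicts) {
      bestConflicts = c;
      bestType = candidate;
    }
  }
  map[x][y] = bestType;  // accepted even when bestConflicts > 0
}
```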
Hi. Thanks for the video, it's very clear and easy to follow, despite you using C. I've been wondering about something for a really long time, and I'm wondering if you have any insights. Suppose I have something like a global constraint. What I mean is: in your example concerning forest, beach and ocean, to check if the constraint is satisfied - that forest can't touch the ocean - all you need to do is check a local kernel around each cell. I'll call this a _local constraint,_ to contrast with a _global constraint,_ whereby I have to check the whole array to determine if the constraint is satisfied (maybe in principle). One global constraint that I've been wondering about, for example, is "group sizes must be distributed according to a power law" - look up something like mathematical percolation or the Ising model at critical temperature to get an idea of what this looks like. I think this global constraint tends to be satisfied by terrain generation in nature, so I was wondering: are there any computational efficiencies we can exploit for global constraints like this; maybe even global constraints of a stochastic nature like this one, where the degrees of freedom aren't really clear until we see _a_ solution? And could those efficiencies scale to something that might run in practical time at the size of, say, Minecraft?
I'm glad you enjoyed it. Making the coding clear is one of my main goals, so I'm glad to hear that you found it clear. The language is Java, not C, but at this level they are almost identical. The major difference is that the arrays are declared as:

int[][] map; // Java

not

int map[][]; // C

Global constraints are a great idea. I think the answer depends in part on the type of constraint. For example, if the constraint was no more than 20% forest then it would be easy to update a global counter as part of the constraint satisfaction algorithm. Every time you added a forest, add one to the counter; every time you removed a forest, subtract one. But that's the simple case. For your group-sizes constraint I think the only solution would be a separate pass over the whole map calculating the required statistics. But for a game this would only need to be done once at the beginning, so even if it took a while (tens of seconds) it could be hidden behind a loading screen, tutorial, character selection screen, etc. Another interesting question is: does this (at least for the power law) happen automatically with this algorithm? I could see where very large sets of terrain are increasingly unlikely. Or is there a way to seed the initial map to get a desired distribution? E.g. by placing random circles of forest that follow a power law.
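Going back to the simple counter case, the bookkeeping is only a couple of lines; a sketch (FOREST and the field names are illustrative, not from the video):
```
// Sketch: track the forest count incrementally so a global
// "no more than 20% forest" constraint becomes an O(1) check.
int forestCount = 0;
int maxForest;  // e.g. (int)(0.2 * worldWidth * worldHeight), set in setup()

// Call whenever a cell changes from oldType to newType.
void updateForestCount(int oldType, int newType) {
  if (oldType == FOREST) forestCount--;
  if (newType == FOREST) forestCount++;
}

// The global constraint then reduces to a constant-time test.
boolean forestBudgetOK(int newType) {
  return newType != FOREST || forestCount < maxForest;
}
```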
@@programmingchaos8957 For the aforementioned algorithms, it's an emergent feature. Both of them are essentially cellular automata that update based on a local kernel passing over the grid element-wise with random fluctuations in cells, which would probably be more efficient than passing over the whole thing, checking the statistics, making an adjustment, checking to see if the statistics are closer, etc. (plus I imagine that'd look more unnatural), but scaling these things up is still a nightmare. I tried a while ago getting them to a good enough looking resolution, and it got to taking tens of minutes in Python (albeit using a less efficient algorithm, since I wanted to also track the correlation length over time). The thing about the power-law distribution is that it's scale-free, so you can zoom into any part of the distribution and it should be similar. This is really the property I'm interested in, because _in principle,_ if you have local information around some region in your grid, then that should also give you global information - but it only gives you the information about the distribution, not about how that distribution is realised. I just thought that it would be nice if I could exploit that fact to allow it to scale faster.
@@programmingchaos8957 Sorry, I typed a response to you, but it disappeared. The aforementioned algorithms are essentially cellular automata that check some local kernel around a cell and transform it depending on that calculated value. The power-law distribution is emergent from these rules. As they are, these algorithms scale pretty poorly both in space and (particularly of interest) time. There are more efficient algorithms that one can use (say for the Ising model), but you end up sacrificing some other things. It might be fine for the purposes of terrain generation, but I wanted to track the correlation-length of the whole system in real-time, which it looks like only the Metropolis algorithm was capable of doing. Nonetheless, I'm still unconvinced about the traditional approaches. I tried scaling these guys up before, and they took me tens of minutes on my machine with Python (I couldn't quite get what I wanted out of Numba; it threw a fit at me when trying to parse multidimensional arrays, so I shelved it. I know some C now, so I might revisit it in Cython). The thing about power-law distributions, is that they're self-similar at different scales - if you zoom in or out of the distribution, you end up seeing the selfsame distribution. Think: the Pareto 80/20 rule in economics: 80% of the wealth is in 20% of the population; 80% of _that_ wealth is in 20% of that population, etc. etc. Similarly, with these phenomena, if you zoom in or out of these grid-arrays, the group-size distribution is the same. This property is what's known as being "scale-free". This means that _in principle,_ we should have information about the global distribution of group-sizes from local information in some region. I'd hoped I could use that fact to kind of 'compress' the grid in some way, maybe to speed things up. I would say that with these algorithms, seeding can be difficult. What tends to occur is that the major groups end up moving around in non-trivial ways, making them unrecognisable from the initial seed. You can probably obtain utility out of seeding if you don't run them for a very long time, but you want to run these things for a very long time to obtain power-law results.
you could also check systematically for conflicts without changing anything, keeping one at random to change later. So you'll be sure when it returns false that there are no conflicts.
Definitely. It would be almost the same as the leastConflicts() function, but would go through the map systematically and just return whether it found any conflicts or not.
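Something along these lines (a sketch; checkConflicts(x, y) is assumed to count violations for the value currently at (x, y)):
```
// Sketch: systematic scan that reports whether any cell still violates
// a constraint, without changing the map. Usable as a termination test.
boolean hasConflicts(int[][] map) {
  for (int x = 0; x < map.length; x++) {
    for (int y = 0; y < map[0].length; y++) {
      if (checkConflicts(x, y) > 0) return true;  // found one, stop early
    }
  }
  return false;  // a full pass found nothing: all constraints satisfied
}
```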
It could be interesting to try hierarchically fractal approaches, similar to how cellular automata caves are sometimes run over several scales - with the higher resolution passes having constraints both for neighbors and the parent layer (coarser scale).
That's a great idea! I can already imagine forest tiles that you can zoom in on and see different types of forest, depending on the neighboring cells, conifers next to the mountains, deciduous next to the plains, and complex mixing in between.
Decent video, love the mod thingie for wraparound. But I must say, this seems like a very special case of cellular automata, basically where the cell update rule is constraint checking - not that it's bad or anything :) Hm, actually even wave function collapse can probably be done as a CA, might have to look at that.
I was thinking the exact same thing! A lot of the output, for example the plants, look a lot like the output of CAs. One difference is that typical CA rules only allow one output. For example, if we started with a predefined row of random terrains and filled in the rest of the map using CA rules starting at that row it would normally require rules that defined exactly which terrain went in each cell based on the previous row. But randomized rules should work. It wouldn't allow the map to wrap properly though.
@@programmingchaos8957 IMO you are selling CA short :). I'm using CA for map generation for a game I'm working on. I'm not sure what you mean with "not allow to wrap properly" - I have no problem having it implemented, on hex. And yes, you can have non-deterministic rules there and all kinds of jazz. I'd say "one output" is also sorta underselling it (and also here I do not see anything more than one output either :) ). First, the output that is visible can differ from what is stored (the state), and the state can be arbitrarily complex. I.e. in my case my state is multiple "layers" of ints, so just an array :), each with different meanings, mainly used during different iterations. So you could easily have "superposition representing states" too. And also display however you want. I.e. in my case initial runs are just numbers for which territory a cell belongs to, and I expand them using non-deterministic rules. You can then assign some of them to be land (or it can be done before :) ), and just draw the map based on that. But then I can run different rules concentrating on the height layer, taking into account land and existing heights, etc, etc. So in all it's very flexible; that's why for this I see little difference. Still, that generation setup/rule with constraints is interesting, especially cool to see a demonstration of what the results look like! Might even experiment with something like it, but I'd probably tweak it a bit with regards to distance (i.e. use Euclidean maybe, or some weird randomization for a bit more variety).
You're right, I was too focused on the standard CAs with relatively simple, deterministic rules. Do you have any kind of a development blog? I'd love to check it out.
14:49 There are a few other ways you could have done this. First of all, you could shuffle a list of every tile so as to prevent looking at the same tile twice. Secondly, you could store the old map and the next iteration of it, so that you can go through systematically, referencing the old map but making all the changes to the new map, which should be more efficient. But it should still work fine either way!
A shuffle would definitely work well. I didn't use one only because there isn't a built in shuffle function (at least that I know of) and I didn't want to take the time in the video to code it as well. For the old map, new map, wouldn't that also require recopying the old map back into the new map for next cycle? Or, much more efficient, think of them as map one and map two and just have a flag that says which is being updated on each cycle. That should work well.
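A sketch of that two-buffer flag idea (illustrative names; nextValue() stands in for whatever update rule is used):
```
// Sketch: two buffers plus a flag. Read from one, write to the other,
// then flip the flag; no per-cycle copying needed.
int W = 200, H = 200;  // map dimensions (illustrative)
int[][] mapA = new int[W][H];
int[][] mapB = new int[W][H];
boolean useA = true;   // which buffer holds the current state

void step() {
  int[][] src = useA ? mapA : mapB;  // old state (read only)
  int[][] dst = useA ? mapB : mapA;  // new state (write only)
  for (int x = 0; x < W; x++) {
    for (int y = 0; y < H; y++) {
      dst[x][y] = nextValue(src, x, y);  // assumed update rule
    }
  }
  useA = !useA;  // swap roles for the next cycle
}
```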
6:33 For this particular system you could probably generate it more quickly by starting with a 2*2 grid then doubling the resolution and introducing some chaos, repeating this enough times will get the desired pixel resolution without having to generate all the pixels straight up.
Interesting idea. I'm not convinced it would be quicker, but even if it's not, this would be a great way to introduce larger land forms. E.g. mountain ranges that extend across a whole continent, inland seas, etc.
@@programmingchaos8957 Well, the reason you could get it quicker is because you can store data from the previous iteration to help speed up the current one, like the edges between beach, land, and sea; those are places where there are likely conflicts, so knowing where they are can limit your conflict resolution function to just those spots. Though you may want to have an additional sprinkling of a few random checks to make sure things aren't too uniform.
Wave function collapse (WFC) is a bit different. Wave function collapse starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms. The big difference is that with WFC, if you get to a variable that has no allowed values, you backtrack, undoing the last few value assignments, and try again, or give up and start over. Whereas with minimum conflicts you accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer. Honestly, as far as I can tell WFC is identical to a depth first constraint satisfaction search using forward checking. If you do use WFC there are some heuristics that can make it much more efficient:
1. Assign values to the variables with the fewest remaining values first. E.g. if cell 1 can be any terrain, and cell 2 can only be mountains or forest, then assign a value to cell 2 first - you're much more likely to reach a solution or a dead end faster and save search time.
2. Assign values to the variables involved in the most constraints first. E.g. a cell with lots of neighbors, rather than only a few.
3. Assign values that give the most flexibility to 'neighboring' variables. In the terrain example it means assigning a 'middle' terrain like plains or forest before an edge terrain like high mountains or deep water. This will lead to solutions faster, but maybe less interesting terrain.
Yes, and works a lot better than the minimum conflicts approach shown here, which I suspect would flail around for a long time without finding a solution. Minimum conflicts generally only works well when there are lots of solutions, but then it can be a lot faster than WFC.
3:18 Hm, one potentially effective additional constraint to make this generate better looking land is to make the grey parts in-between land and water require that there can't be a 2*2 square of only grey tiles, this will hopefully prevent too small puddles or chunks of land.
Nice idea! For the video I was intentionally keeping the number of constraints low to reduce the total programming time. But I want to try this out and see how it looks.
I am going to use a similar algorithm to simulate stuff by generating a map with one dimension: time. Pixels are going to be replaced by states, and there is a constraint violation if state[t] can't evolve into state[t + 1]. Also, I am going to add time travel to this system.
When to write the unit test for checkConstraints? When you first use it, you have in mind what you expect, so that's a perfect time to write the test. If you don't want to interrupt your flow, you can delay the test case until you have written out all the uses of the results. I'm looking forward to having a reliable AI write the test at the same time as generating the current stub function: not yet implemented...
Writing the tests for the function is a great application area for an AI. I'm a little skeptical of the claims that it will take over programming. LLMs are great at problems that have been solved hundreds of times already (writing a resume, a legal brief, or a quick sort algorithm), which makes them seem really impressive. But most programming projects should include some new aspects, requiring new solutions, which LLMs are not so good at. But generating tests for human-written code seems like a space they would perform well in.
Interesting how you can easily "filter" randomness to end up with some order. Stable Diffusion works in a very similar way; the only difference is that instead of using some simple logic it uses a UNet to predict the noise. Also (not that I have anything against it, but) you should consider using something like Shadertoy; GPUs are much, much better at munching tasks that can be parallelized. But for learning, yea, using this makes it way easier to understand what is going on exactly.
I hadn't thought about the relationship to Stable Diffusion, there is an interesting similarity. And yes a shader would definitely be the way to do this if you wanted it to be super quick or had a much bigger map. As you noted, my focus is really on teaching the algorithm. But it does make me think that I should put together a video or two on basic shader programming and its applications. Thanks for the suggestion.
ohhhhh I'm having an idea! How about a game where you play through a map which is generating itself in real time? I wonder if that concept would be viable.... If I had more free time I would start on that tomorrow morning. :)
So, the map isn't decided at all until the player moves into the new areas? That would be interesting, especially if the player's actions influenced what appeared next. So, the terrain would have to follow the rules, but if a given cell could be either plains or forest, which one it became depended (in part) on the player. For example, if they traveled from forest it was more likely to continue as forest, or they had some ability to influence the probabilities. If you make it I'll play it! :)
@@programmingchaos8957 I didn't think that far ahead :) I had an idea of the maps generating in a 2D platformer style. Having some gameplay elements advance the generation sounds more viable. Making a game out of that would require some serious deep thinking....
The "Perfect CS" algorithm would look something like this: ``` generate_random_map(); // each pixel is random value while (!is_map_valid()) { // check if map has any constraint violations generate_random_map(); // try again } ``` Of course this would be very inefficient and have exponential time complexity. So what other algorithms produce the same output (with same probabilities) as "Perfect CS"? Is algorithm shown in this video "perfect"? Is WFC "perfect"?
I'm not sure about perfect, but one formal term is 'complete'. A complete algorithm is guaranteed to find a solution if one exists and return false if no solution is possible. The minimum conflicts algorithm I presented is definitely not complete. It is only likely to find a solution if lots of solutions exist - which should be the case for procedural generation. WFC can be complete if, every time it reaches a variable that no longer has a valid assignment (e.g. a cell between forest and deep water), it backtracks, undoing previous assignments and trying new ones until it solves the conflict. But that type of exhaustive search tends to be a lot slower. With regards to probabilities I suspect, but don't know for sure, that both minimum conflicts and WFC create terrain that's 'clumpier' than your algorithm would produce, because the easier-to-find solutions have large clumps of the same terrains. In fact, if a map problem is hard (for example if the range where the cells can't conflict is large) minimum conflicts tends to 'give up' and just push everything towards a middle terrain like plains.
I want to live in the universe where all of my code compiles and runs correctly the first time. But I suspect that even in an infinite multiverse that one doesn't exist :)
@@programmingchaos8957 By "perfect" I don't mean a program that finds a solution, but it also needs to have the same probability distibution as the orginal algorithm
Very cool. I think you need to figure out a way to calculate every cell at once though. For example for plants, it's not that it's always the top layer that grows, but rather every plant layer divides and spreads out further, which makes it look like it's growing upward. I don't like the idea of using random. Nothing is random in nature, and pseudo-random is just bias. I'm sure you could come up with more clever ways to do that too. For example, rule 30 cellular automata looks random, but is actually not. Incorporate some clever rules like those that exhibit complex seemingly "random" behavior, but are actually deterministic.
People used Rule 30 as a pseudo-random number generator for a long time after it was developed. pRNGs are all like this, so there isn't really a material distinction between Rule 30 and other methods like it. Also it's not correct to say pRNGs are biased. If they were, we could detect it by running it a whole heap of times, and generating the empirical distribution for the frequency of numbers it generates, and test to see if it's significantly different from a uniform distribution. We use pRNGs _because_ there is no test we have yet developed that can distinguish them from true RNGs. All we know is that they will eventually cycle, because they are determinate.
Very interesting idea! It would be possible to go through and calculate the 'next' value, but not update it until all of the next values were calculated. This is the standard approach for cellular automata like Conway's Game of Life th-cam.com/video/SgrenppLn8c/w-d-xo.html. Of course, it could lead to the problem of picking conflicting choices for neighboring cells and having to fix it on the next iteration. It might give interesting results for the plants; not sure if it matters for the maps or textures. I understand wanting everything to follow solid, deterministic rules. The issue with purely deterministic rules is that they will always lead to the same outcome, and I want different outcomes each time. I could use deterministic rules with (pseudo)random starting conditions, but that might give a smaller range of outcomes (and doesn't completely do away with randomness). I prefer to think of the (pseudo)randomness as representing unknown hidden variables. For example, coin flips are not truly random - in the sense that if you knew the exact forces being applied at the beginning of a flip then a physics simulator could tell you whether it would be heads or tails (and I believe that accomplished sleight of hand artists can flip a coin exactly N times to control whether it's heads or tails). But if I want a coin flip in a game I'm just going to randomly determine heads or tails, not build a physics simulator for coin flips. This is the same idea: for plants I'm not going to try to simulate all of the factors that cause a plant to branch or not, I'm just going to choose to randomly branch. However, that's because of my relatively simple goals. There's probably a good use for a program that does simulate all of those factors. (And a mini-game where you try to apply the forces to get a coin to flip exactly N times might be fun.)
Few, if any, pseudo-random number generators are biased in the sense of generating more 8's than 9's. But all(?) have artifacts that can be detected with subtle statistical tests. The Wikipedia article on pseudo-random number generators has a nice overview of some of the issues. My understanding is that because of this, for super high risk areas, like generating encryption keys for launch codes, they use hardware random number generators that use physical phenomena, like radio static, to generate random values.
7:59 Ah! That would probably be beneficial for code readability and tutorials, I however cannot be bothered placing small comments so instead have to re-figure out my code every time I come back to it. Come to think of it, there may be a lesson to be learnt here.
I learned the hard way that these types of comments save time in the long run - at least for me. I've come back to code just a few days later and spent hours struggling with it only to realize that I misremembered what the integer values represent.
@@programmingchaos8957 Yeah, normally it takes a few weeks for me to start forgetting things, but if I ever intend on making a sequel I'm out of luck! It doesn't help that my variable/function/anything naming scheme looks like this, DucVelX. I imagine leaving the comments can be very beneficial!
Good catch! It's very similar, but in minimum conflicts the algorithm will assign values that violate the constraints with the hope of fixing them later. WFC traditionally will backtrack and undo some of its previous assignments (or start over) if it finds a variable (cell) with no acceptable values (e.g. no terrains that can be assigned without violating constraints). It means that WFC is more likely to find rare solutions, when minimum conflicts will just keep trying invalid values, but it also means that WFC is slower - and more complex to code. As a side note, WFC is even more similar (maybe identical) to the forward checking algorithm for constraint satisfaction.
Not exactly. In WFC values are only assigned if they don't violate any constraints. If the WFC algorithm finds a variable (cell) with no assignable values (e.g. mountains on one side and deep water on the other) it either backtracks, undoing and retrying some previous assignments to avoid the issue, or starts over from scratch. In contrast, the minimum conflicts algorithm just assigns the value that leads to the fewest conflicts and hopes it gets fixed on a later pass by changing some of the neighboring values. The least conflicts algorithm is simpler to code and runs faster, but if there aren't very many solutions it may fail to find a solution with no conflicts. WFC is harder to code and may take longer to run, but is more likely to find a solution when they are rare.
The one thing that always annoys me about this type of procedural generation is that it's highly non-trivial to parallelize. Of course, you can use ping-pong buffers and then execute the algorithm for all cells in parallel. However, this means every cell considers a new type based on the old state independently, so after the step the neighbors of a cell have changed as well, which changes whether the new type was an improvement or not. This means slower convergence (Gauss-Seidel vs. Jacobi is basically a similar situation), but that doesn't matter if it allows you to run it on the GPU much faster. The real problem is, this just leads to ugly maps. Sure, the constraints will be fulfilled, but not in a manner that would satisfy us.

So, in order to parallelize this in a meaningful way, we would have to run N threads for some number of randomly selected cells, where those cells are independent (i.e. no two cells are in range of each other). This is non-trivial, but can be done with rejection sampling. However, this just makes things harder to parallelize again. Rejection sampling is kind of a sequential thing, after all. If you try to do it in parallel and add multiple new samples at the same time, those samples are dependent on each other again and would affect each other in whether they need to be rejected or not... We could just use static patterns that divide our set of cells into independent clusters, but then this static non-random pattern will affect the results again. And computing sets of random independent clusters is... well, highly non-trivial. It is not impossible, but trying to make this a parallel thing on the GPU is very unsatisfying.

Currently, the "simple" fix I use is a separate initialization step. I noticed that for a range of 1, the parallel version seemed to produce comparable results. But the higher the range, the more degenerate it becomes, generating mostly plains and forest in my case. So I simply initialize cells at random in (2*range+1) x (2*range+1) groups, i.e. every 7x7 pixels for range=3 are assigned the same random cell type. This looks much closer to the iterative method.

I also tried a different method to select new types. Your current method is iterative and takes the best out of K randomly sampled types. In rare circumstances, it can pick a type that has more conflicts, and this is an important property. If I initialize leastConflicts to the current amount of conflicts and bestType to the current type, results are pretty bad again in my GPU example. Based on this observation, what I am doing right now is to only pick a single random type to try. I compute the signed difference in conflicts and run that through an exponential function. If the new type is an improvement, the exponential function returns a value > 1. Otherwise, the input is negative, and the worse the new type is, the smaller the value, and the value will be in [0,1]. Then I use this value as an acceptance probability, to allow accepting things that worsen the result (so long as the current cell isn't conflict-free already). Behaviour can be tweaked by multiplying the input to the exponential: values > 1 make it less likely to pick worse cells.

For a measure of how fast this is: at 1280x720 cells and a range of 3 (so iterating 7x7 cell areas), the first frame my eyes can see when the window opens is already in a stable configuration.
Great comments and suggestions! The exponential function sounds very similar to simulated annealing. With simulated annealing the probability of accepting a worse solution decreases over time as the 'temperature' decreases. One question it raises is how to measure 'ugly' maps. They are obvious to the eye, but can we characterize them mathematically? Presumably the 'count' of terrain matters: only plains and forest isn't good. But what is the right amount of mountains (and high mountains) and water (and deep water)? And the distribution also matters, maybe a specific distribution of clump sizes? If it were possible to measure that, then you could do meta-optimization on the algorithm to tweak it to give the best terrain types. One question: as I mentioned in the video, I found using high mountains and deep water as 'anchoring' terrain seemed to help. Have you experimented with that approach at all?
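For comparison, the classic simulated annealing (Metropolis) acceptance rule looks something like this sketch, where delta is oldConflicts minus newConflicts and the temperature is lowered over time:
```
import java.util.Random;

// Sketch: Metropolis-style acceptance. Improvements are always taken;
// a worsening move is accepted with probability exp(delta / temperature),
// which shrinks as the temperature decreases.
boolean accept(int oldConflicts, int newConflicts,
               double temperature, Random rng) {
  int delta = oldConflicts - newConflicts;  // positive = improvement
  if (delta >= 0) return true;
  return rng.nextDouble() < Math.exp(delta / temperature);
}
```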
@@programmingchaos8957 The way I am initializing the terrain is kind of inspired by this anchoring approach: since I use larger squares of a uniform type, I am basically anchoring or biasing the algorithm such that it won't just end up in one of these very uniform solutions. This could of course be combined with what you showed in the video, and I think it could be further improved. For example, if you make the squares quite big, the converged solution can have some straight long edges. So one thing that would make sense is to randomly blend over to the surrounding terrain types in this initialization to remove these patterns. I definitely remember watching videos on simulated annealing before. That might be where this idea came from; I felt like I knew about something similar where the exponential function was used, and now that you mention it I think it was simulated annealing. Although I feel like I have seen these concepts in some Monte Carlo sampling strategies as well. The exponential function really is quite useful across many applications; recently I learned about how to use what is basically exponential decay for interpolating animations smoothly.
@@maxmustermann3938 I also saw some suggestions to try multi-scale systems: a larger cell of forest that, when zoomed in, has more varied terrain. Which is quite a bit like anchoring terrain regions.
Very similar, but not quite the same (I'm pretty sure). With minimum conflicts the algorithm will assign a value (terrain) even if it violates constraints. With WFC it will only assign conflict-free values; if a variable/cell has no acceptable values left, WFC will backtrack, undoing past assignments, and try again (or will start over from scratch). This means that WFC is more likely to find a solution if they are rare, but can be much slower. WFC is, as far as I can tell, identical to the forward checking algorithm that is used for constraint satisfaction.
Wave function collapse (WFC) is a bit different. Wave function collapse starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms. (There are some nice blog posts out there explaining WFC more clearly than this.) The big difference is that with WFC, if you get to a variable that has no allowed values, you backtrack, undoing the last few value assignments, and try again, or give up and start over. Whereas with minimum conflicts you accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer.
Was anyone able to mimic the plant generation? I have been trying but can't figure out how to program the logic based on his explanation! And my output looks very messy haha
I'm happy to try to help. I began by setting the number of conflicts in checkConflicts() to 0. Then I checked the current square: If the current square is black, I had the function immediately return (0 conflicts, because black is always allowed). If the current square is green, I checked only the three squares below (+1 in the y direction) the current square. If none of those were green, I set the conflicts to 1 (a green square must have at least one green square below it). If the current square is red, I checked the square above it (-1 y) and set conflicts to 1 if it's not red or green (red must have red or green above it). Then I returned the number of conflicts. The other pieces: the bottom row (max y) is fixed as always green; the search was done systematically from top left to bottom right - unlike for the maps - because the plants grow up.
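In code it looks roughly like this sketch (EMPTY/GREEN/RED and the x-wrapping are my illustrative choices, not necessarily exactly what's in the video):
```
final int EMPTY = 0, GREEN = 1, RED = 2;  // illustrative terrain codes

// Sketch of the plant rules described above. Returns the conflict count
// for the cell at (x, y). Row 0 is the top; the bottom row is fixed green.
int checkConflicts(int[][] map, int x, int y) {
  int w = map.length, h = map[0].length;
  int cell = map[x][y];
  if (cell == EMPTY) return 0;       // black is always allowed
  if (cell == GREEN) {
    if (y == h - 1) return 0;        // fixed green bottom row
    for (int dx = -1; dx <= 1; dx++) {
      int nx = (x + dx + w) % w;     // wrap in x
      if (map[nx][y + 1] == GREEN) return 0;  // green support below
    }
    return 1;  // no green among the three squares below
  }
  if (cell == RED) {                 // red needs red or green above it
    if (y > 0 && (map[x][y - 1] == RED || map[x][y - 1] == GREEN)) return 0;
    return 1;
  }
  return 0;
}
```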
You could make your code simpler by not including a separate condition for the undefined terrain type, and just treating it as a regular terrain type which never satisfies any constraints: notAllowed[0] = {100, 100, 100, 100}, without changing the other constraints: notAllowed[n][0] = 0.
Good catch! It is similar to WFC. The major difference is that minimum conflicts accepts assignments that violate some constraints with the hope that they will be fixed, by changing neighboring cells, in later passes. If WFC finds a conflict it either backtracks, undoing some of the previous assignments, or restarts from scratch.
Interesting idea. If the constraints necessary to make the algorithm work properly were well defined this could be doable. There is a field, genetic programming, that evolves programs to solve problems. Some of the ideas of fitness from that field might translate into constraints.
Don't be hard on yourself. I've been coding for literally decades. And it still took me probably 6 or 8 different projects that used similar range checks to come up with that approach. I remember starting with 8 separate if statements, one for each neighbor, and each with its own boundary checking conditionals, something like 24 ifs altogether :)
Every now and then TH-cam drops a gem in my recommended feed. This is one of those times. Liked and subscribed! 👍
Thanks! I'm very glad you're enjoying the videos. If you have any ideas for projects you'd like to see a video on, please feel free to let me know.
Exactly my thoughts!
oh this is an incredibly wholesome reunion
This is a really nice idea. I have to go to work now but will certainly check it out later. Thanks Terry.
Thanks! I hope work goes well, and let me know what you think when you've seen the whole thing :)
I imagine a Sokoban-like level generator with rules like "a crate must not be blocked by walls and other crates" and the like. Great video!
That's a great idea!
The ability to add initial values like you did with the island seems like the coolest feature here. I could imagine drawing in a rough landscape idea with an image editor and then handing those initial conditions to the constraint satisfier to turn it into an actual map
Thanks! I agree that that makes it much more interesting. I think it's sometimes referred to as a falloff map if you want to learn more about how it can be done.
This could be used to generate test cases for obscure scenarios.
One time I used constraint programming to generate all possible cases (a few hundred different cases at most) with a bunch of weird rules given by business analysts. It proved that they didn't even know what they were asking. It simplified the rules in the end, because they were cancelling each other out in a few cases.
Very cool application! It is surprising how many problems can be reformulated as constraint satisfaction, and then there are well-defined solution approaches.
@@programmingchaos8957 Indeed. I believe these methods sound too formal to be used out there. That's why we don't see more of them.
In more important projects or fields, I believe this method is treated with more importance. I wish I could work with it.
Didn't absorb it all, as I have many many things I still need to learn, but the way you explain makes it so much easier and fascinating; you made me feel like I was consuming something that should be paid for. Great video!
Also, just a rather silly thing. You might want to flip your webcam next time, so when you gesticulate to the right/left, it'll be the right/left meant to be shown.
Thanks! I'm really glad you found it useful.
Good point about flipping the camera. I should have noticed that during editing. Thanks for the suggestion.
Very nice video!
I do like this concept a lot as I've been interested in procedural pattern generation, but it seems cumbersome to have to set all the values myself. WFC does not require this, but it is not as flexible as the approach you showed.
This has given me some ideas on how to improve some of my own projects.
I hope to see more from you!
Thank you! I think this approach is quite a bit simpler than WFC. The main place WFC will work better is if there are relatively few solutions available; then its more systematic approach is more likely to find them.
I plan to keep producing, if only because I enjoy it. I do have a fair number of other videos, many on procedural generation, but covering more specific techniques.
Awesome and inspiring, thank you for posting that. I appreciate that you left in your errors but added "Yes, I know that's incorrect" call-outs so that those of us yelling "You made a mistake!" can relax. I've built a web-based 2D animation framework for browsers with Erlang, JS and websockets, and this gives me an idea of what to use it for.
Thank you! I leave the errors in (and point them out) for a couple of reasons. I think it's a better representation of 'real' programming, where we make mistakes, typos, etc. I'm afraid that showing perfect coding is a little demoralizing, especially for newer programmers. And this way I don't have to rerecord over and over until it's perfect :) I'm glad you appreciate the pointers to the errors.
If you add this to your framework, feel free to post another comment, I'd love to see it running.
What a gem of a channel I've stumbled upon. Thank you so much for sharing your knowledge. Subbed!
I'm thrilled you're enjoying it! If you have any ideas for projects you'd like to see, let me know.
Always learning with your videos, thank you!
I'm really glad you're finding them helpful!
You're really good at explaining. You sound like a professional professor (not only a youtuber).
Thanks! There's a reason for that :) Because this is my channel and not a University one, I'm not sure if I should mention the name, but the banner in the background makes it easy to figure out if you're interested.
Wonderful! I am going to make this once I finish playing around with the particle life. Now I am trying to make the 3D version in threejs 😀
Sounds great! I struggled a bit with the 3D version because there are nine cells 'under' each cell, so the constraints had to be changed a bit to keep it from just filling with green.
so glad the algorithm picked this one up, awesome channel!!
Thank you! I hope some of the other videos are also useful. And if you have suggestions for topics, please feel free to share them.
one of the most insightful and straightforward videos on this topic on youtube. 5/5 tutorial
Thank you! I'm really glad that it was clear and straightforward, as making the explanation clear and easy to follow is one of my main goals.
Just found this channel, this is really cool! I love it.
Thank you! I'm glad you're enjoying it. If you have any favorite projects you'd like to see coded let me know.
29:18 That's very nice! It's doing an exceptionally good job at making the circle not too perfect looking. Though, I can imagine that it would be obnoxious if the deep water was a kill zone because then the player wouldn't have a precise measurement of how far off from land they can go before getting to deep water.
The deep water issue is an interesting one. How much the shallow water (or any other terrain) spreads depends a lot on how many tries are made to find a terrain that meets all constraints. With few tries the terrain 'wanders' more. In the deep water case it would probably make sense to do a post pass that smooths out that boundary so it's obvious.
@@programmingchaos8957 Yeah, that or just some easy to spot indicator for when you get closer to deep water.
great content and great channel man, I appreciate how you explain everything you are doing.
Thanks! I'm glad to hear that the explanations are working - that's one of my major goals.
this is a neat technique. interesting that the search window functions to limit the spatial variance frequency.
although i still think it’s generally much better to do multiple passes at lower resolution to bias it, or allow for more complex output from simple rules (like height, temp, and moisture as simple low res runs in that order, each depending on the previous, possibly with simpler rules or techniques, then upscaled, and then used to bias something to pick the final state. simple rules for each but much more complex output).
Thanks for the feedback. I tend to agree that there are a lot of things that could be done to make the terrain more interesting/realistic. I particularly like the idea of applying the algorithm at multiple scales. The purpose of this video of course was just to be a relatively quick introduction to the approach, and a brief discussion of other application areas.
Very cool video! Nice to explain in detail what you are doing.
Thank you! I'm really glad the explanation was helpful.
very interesting generation
I'm glad you enjoyed it! The amount of variety that can be generated from simple algorithms is fascinating to me.
Rather than randomly generating terrain types in leastConflicts, you can keep a set/arrayList of the types with the least conflicts. Then choose randomly from that list. This avoids the "fill everything with mountains" problem and also does not skip types or generate them twice.
Definitely! I was thinking my approach was a bit easier to code, and was trying to keep things simple. But removing items in a random order from an ArrayList would be pretty simple. Languages with a built-in shuffle function would also work well of course.
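For anyone who wants to try it, here's a rough sketch of that idea (untested, Processing/Java style; countConflicts(x, y, type) and NUM_TYPES are stand-ins for whatever your version uses, with type 0 as 'undefined'):
```
// Collect every terrain type tied for the fewest conflicts, then pick one
// of those at random - no types skipped, none tried twice.
int pickLeastConflictType(int x, int y) {
  ArrayList<Integer> best = new ArrayList<Integer>();
  int fewest = Integer.MAX_VALUE;
  for (int type = 1; type < NUM_TYPES; type++) {  // skip 0 = undefined
    int conflicts = countConflicts(x, y, type);   // hypothetical helper
    if (conflicts < fewest) {                     // strictly better: restart the list
      fewest = conflicts;
      best.clear();
      best.add(type);
    } else if (conflicts == fewest) {             // tie: one more candidate
      best.add(type);
    }
  }
  return best.get((int) random(best.size()));     // uniform pick among the ties
}
```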
This is an awesome video! I love the idea of it generating in real time, I could imagine implementing painting-systems where you can live edit the world and paint in your own details.
Thank you! And that's a great idea! Painting in the rough map to start with, 'I want a mountain range here, a deep lake there, a massive forest in the North' and then letting the algorithm fill in the details, and then being able to go in and re-edit, clean up shorelines, merge mountain ranges, add mountain passes, etc. would be fantastic.
How my uneducated 32 year old self taught not-quite 'programmer' thought of this before watching your video: 'bucket algorithms' and 'water algorithms'. Haha ignorance is truly bliss but thank you for the video! I'm trying to figure out how to do this with mpi in the bash terminal.
I'm glad you enjoyed the video. It's been a long time since I've used mpi, so I probably can't help you there. But I would love to hear how it turns out.
Remember to include some check that guarantees termination, otherwise this might run forever for unsatisfiable problems
Right! But this is the drawback to local search: because it's fairly random there's no guarantee that it will find a solution even if one exists (technically it's an 'incomplete' search). So, the best you can do for termination is to stop after a lot of tries. The assumption is that if you are doing procedural generation there should be a lot of possible solutions and so the algorithm will find one. If there's only one (or possibly no) solution it's better to use a complete search algorithm, but they can be harder to code and a lot slower to run.
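In code the guard can be as simple as this (a sketch, assuming leastConflicts() is changed to return true once a full pass finds no conflicts):
```
int maxPasses = 10000;           // search budget: tune to taste
boolean solved = false;
for (int pass = 0; pass < maxPasses && !solved; pass++) {
  solved = leastConflicts();     // one pass of conflict fixing over the map
}
if (!solved) {
  println("Gave up after " + maxPasses + " passes - keeping the best map so far.");
}
```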
Just discovered your channel, fantastic video, thanks a lot for sharing this type of content!
Thank you! I'm glad you're enjoying it.
Nicely done!
And extra points for the "I'll notice the spelling later." annotation!
Thanks! I find it can be a bit demoralizing when tutorials make it seem like there are people who can program huge stretches of code completely error free. So, I don't mind showing some of my errors. Of course, it also means I don't have to rerecord those sections, which saves a lot of time :)
@@programmingchaos8957 It's not just demoralizing, it also conveys a completely wrong picture of what programming actually is: an iterative process of finding a solution to a given problem. Coding errors are an integral part of that.
The few times I arrive at a solution without any errors leave me utterly confused :D
When I teach programming to newbies, I like to show them test driven development. It seems to help them understand the iterative process better and to not get frustrated by failures.
Putting it in the framework of test driven development is a great idea! I teach some test driven development, but for newbies I fear I have a bad habit of planning the algorithm on the board (which is usually simple because I've taught it half a dozen times before) and then just implementing it. At least when I'm teaching I do show the accidental typos and minor errors along the way. But don't do as good a job with the larger, logical errors. Thanks!
Super cool! I love the procedural generation niche
Glad you liked it! Procedural generation and artificial life (especially evolutionary alife) are my two favorite programming areas.
Thanks for another fun and interesting video! I find the wave function collapse stuff to be difficult to follow. But your code seems a to make neat things and is much easier to digest.
Thanks! I originally planned to do a video on Wave Function Collapse, but the more time I spent on it the more I felt that for procedural generation minimum conflicts was a lot simpler to code and would do just as well. As far as I can tell Wave Function Collapse is the same as forward checking for constraint satisfaction, so if you're struggling to understand wave function collapse you could look at descriptions of forward checking for constraint satisfaction.
Wholesome video, fun and educational! Love you Terry 🫶
Thank you! I'm glad you're enjoying the content. If there's any topics you'd like to see covered let me know, I'm always looking for new ideas.
You're calling leastConflicts every frame. That would make it slower (checking every frame).
You can fix it by not checking again once it returns success. (Or only counting a few times.)
But it's also cool because you can interactively modify the landscape and let it adapt (e.g. sandbox destruction).
It's also cool that in your checkConflicts you could use programmatic rules.
You've used a map, which is kinda like WFC, but you could have used rules like 'only this many mountain pixels', or 'this much water'. Or like you did with the plants and berry.
Amazing!!
Again you're absolutely right! Instead of doing N checks in leastConflicts and then trying again, leastConflicts could loop until there are no more conflicts before returning. But I liked being able to see how the terrain was changing. And it's good for debugging if something goes wrong. I do like the idea of an extra step in the draw loop where after leastConflicts is called once, it pauses and the user can go in and modify the current map and then rerun leastConflicts.
The map at the very end does something a bit like what you are suggesting with the cities by slightly limiting the number of them. But in general the terrain rules I'm using are fairly simple, mostly to keep the video short and to give everyone something to build on. If you do try out more complicated rules I'd love to hear what the results are.
20:35 A technique I use often in video games is shuffling an array of indexes and walking those. It guarantees you visit all the cells in random order, only once.
Or you can shuffle in place with random and swap. (Kinda like bubble sort.)
Absolutely! I didn't use shuffle because not all languages have it and I didn't want to extend the video by coding it by hand. But it's the perfect solution here.
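For reference, hand-coding it is only a few lines - a Fisher-Yates shuffle sketch in Processing/Java:
```
// Shuffle an array of cell indexes in place: walk backwards, swapping each
// slot with a randomly chosen slot at or before it.
void shuffle(int[] indexes) {
  for (int i = indexes.length - 1; i > 0; i--) {
    int j = (int) random(i + 1);   // random slot in 0..i (inclusive)
    int tmp = indexes[i];
    indexes[i] = indexes[j];
    indexes[j] = tmp;
  }
}
```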
Please don't stop making these kinds of videos! Thanks!
Thank you! As long as people are getting something out of them I plan to keep making them.
This is a great showcase of how the process of procedural generation can go, and it's very well explained, thank you :)
I do want to note, around 22:30
(dx + x + worldWidth) % worldWidth
= (dx % worldWidth + x % worldWidth + worldWidth % worldWidth) % worldWidth
= (dx % worldWidth + x % worldWidth + 0) % worldWidth
= ( dx + x ) % worldWidth
So it can be shortened :)
Unless P5js has a modulo operator that works differently from normal math that is
Thank you! Making sure the explanations are clear is one of my main goals, so I'm glad that's working.
Keep in mind that dx can be negative (it starts as the negative of the range). If x is small (e.g. less than range) then dx + x is negative, and, at least in Processing/Java, % of a negative is that negative value. E.g. -2 % worldWidth = -2, which will cause an array out of bounds error. So, the +worldWidth is to wrap the negative cases.
@@programmingchaos8957 Ah, I understand, thank you
The actual modulo operator in maths is defined differently then, and I'm honestly not sure why it's done this way in javascript...
I'm not sure either. But I double-checked that -4 modulo 100 does return -4, not 96, which would make your approach work.
@@programmingchaos8957 Unfortunate :( But if Javascript does not follow the normal modulo definition + worldWidth is a clean fix for it
@@zitronenwasser Me neither. It's why I've gotten in the habit of testing the behavior of functions to make sure they act the way I expect. It's rare, but there's nothing worse than spending hours trying to fix a bug caused by unexpected behavior rather than a mistake.
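If anyone wants to see it for themselves, a quick test in Processing/Java:
```
int worldWidth = 100;
println(-4 % worldWidth);                  // -4: the sign follows the dividend in Java
println((-4 + worldWidth) % worldWidth);   // 96: the +worldWidth trick from the video
println(Math.floorMod(-4, worldWidth));    // 96: Java's floored modulo does it directly
```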
This is so cool :) thanks for sharing!
You're very welcome! I'm glad you enjoyed it.
Amazing video!
Thank you! I'm glad you enjoyed.
Interesting. Is it possible for conflict resolution to create infinite loop? Like for example, you fix one set of conflicts and after few iterations you get back at initial set of conflicts (like those cellular automata that regenerate themself)
I'm not sure. I think it's very unlikely for this algorithm to get stuck in a literal loop because it's making random selections to resolve conflicts. So, to get stuck in a loop it would have to be a condition where there are only a few options that it cycles through. On the other hand it's very possible for it to search endlessly through different terrain assignments without ever finding a conflict-free solution - it's just unlikely to loop.
There are other systematic conflict resolution algorithms that never get stuck and can say whether or not a solution exists, but they are much slower and because they are systematic, don't lead to interesting terrains. They are useful for problems like sudoku, where there are very few solutions, or if you want to know if any solution exists.
Sounds similar to wave function collapse. But the process seems a bit different.
Just heard you mentioning WFC, I wish you explained the diff a bit more.
Nice vid.
Unreal has a WFC plugin. But also this video sounds a bit like PCG.
You're right that this is similar to Wave Function Collapse (WFC), but WFC is a bit different. WFC starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned. For example, if you place forest, every cell around it that doesn't already have a value is limited to mountains, forest, or plains). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms.
The big difference is that with WFC if you get to a variable that has no allowed values (e.g. because water was assigned one away from mountains and there's nothing valid to put in the middle) the algorithm backtracks, undoing the last few value assignments and trying again, or gives up and starts over. Whereas with minimum conflicts we accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer.
Honestly, as far as I can tell WFC is identical to a depth first constraint satisfaction search using forward checking.
If you do use WFC there are some heuristics that can make it much more efficient (a quick sketch of the first one follows the list):
1. Assign values to the variables with the fewest remaining values first. E.g. if cell 1 can be any terrain, and cell 2 can only be mountains or forest, then assign a value to cell 2 first - you're much more likely to reach a solution or a dead end faster and save search time.
2. Assign values to the variables involved in the most constraints first. E.g. a cell with lots of neighbors, rather than only a few.
3. Assign values that give the most flexibility to 'neighboring' variables. In the terrain example it means assign a 'middle' terrain like plains or forest, before an edge terrain like high mountains or deep water. This will lead to solutions faster, but maybe less interesting terrain.
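A sketch of heuristic 1 (often called minimum remaining values); isAssigned(x, y) and domainSize(x, y), which counts the values still allowed at a cell, are hypothetical helpers:
```
// Return the undecided cell with the fewest allowed values left; assigning
// it first reaches solutions (or dead ends) sooner.
int[] mostConstrainedCell() {
  int bestX = -1, bestY = -1, fewest = Integer.MAX_VALUE;
  for (int x = 0; x < worldWidth; x++) {
    for (int y = 0; y < worldHeight; y++) {
      if (isAssigned(x, y)) continue;    // hypothetical: skip decided cells
      int options = domainSize(x, y);    // hypothetical: count remaining values
      if (options < fewest) {
        fewest = options;
        bestX = x;
        bestY = y;
      }
    }
  }
  return new int[] { bestX, bestY };     // (-1, -1) means every cell is assigned
}
```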
Question for ya when you have a chance, what’s the difference between this and the wave function collapse? I’ve seen that method used for sudoku as well
This minimum conflicts algorithm is willing to assign values that violate constraints with the hope of fixing it on a later pass. My understanding is that WFC won't assign a value that creates conflicts. If it reaches a variable that has no valid assignments it backtracks and undoes recent variable assignments and tries different values for those variables hoping to avoid the conflict (or it starts over). This makes WFC much better at solving problems with very few solutions (like sudoku) and it potentially can report that a problem is unsolvable (there is no valid set of assignments). However, the backtracking is trickier to code and it's potentially much slower.
So, if you know there's a lot of solutions available, which is typical for procedural generation, this minimum conflicts algorithm is simpler and faster. But if there are relatively few solutions available WFC or other exhaustive search algorithms may be necessary to find one of the solutions.
As far as I can tell WFC is very similar (identical?) to the forward checking algorithm for constraint satisfaction.
Subscribed just for the channel name and video intro.
Thanks! I hope you enjoy the videos. If you've got any topics you want to hear about let me know.
Hi. Thanks for the video, it's very clear and easy to follow, despite you using C. I've been wondering about something for a really long time, and I'm wondering if you have any insights. Suppose I have something like a global constraint. What I mean is: in your example concerning forest, beach and ocean, to check if the constraint is satisfied - that forest can't touch the ocean - all you need to do is check a local kernel around each cell. I'll call this a _local constraint,_ to contrast a _global constraint,_ whereby I have to check the whole array to determine if the constraint is satisfied (maybe in principle). One global constraint that I've been wondering about for example is, say, "group sizes must be distributed according to a power law" - look up something like mathematical percolation or the Ising model at critical temperature to get an idea of what this looks like. I think this global constraint tends to be satisfied by terrain generation in nature, so I was wondering: are there any sort of computational efficiencies we can exploit for global constraints like this; maybe even global constraints of a stochastic nature like this one, where the degrees of freedom aren't really clear until we see _a_ solution? And could those efficiencies scale to something that might run in some practical time at the size of, say, Minecraft?
I'm glad you enjoyed it. Making the coding clear is one of my main goals, so I'm glad to hear that you found it clear. The language is Java, not C, but at this level they are almost identical. The major difference is that the arrays are declared as:
int[][] map; // Java
not
int map[][]; // C
Global constraints are a great idea. I think the answer depends in part on the type of constraint. For example, if the constraint was no more than 20% forest then it would be easy to update a global counter as part of the constraint satisfaction algorithm. Every time you added a forest, add one to the counter, every time you removed a forest subtract one. But that's the simple case.
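As a sketch (Processing/Java style, with FOREST and map[][] standing in for the video's names, and forestCount as new bookkeeping):
```
int forestCount = 0;   // updated incrementally, so the global check is O(1)

void setCell(int x, int y, int newType) {
  if (map[x][y] == FOREST) forestCount--;  // old value leaves the tally
  if (newType == FOREST) forestCount++;    // new value joins it
  map[x][y] = newType;
}

boolean underForestCap() {
  return forestCount <= 0.2 * worldWidth * worldHeight;  // at most 20% forest
}
```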
For your group sizes constraint I think the only solution would be a separate pass over the whole map calculating the required statistics. But for a game this would only need to be done once at the beginning, so even if it took a while (tens of seconds) it could be hid behind a loading screen, tutorial, character selection screen, etc.
Another interesting question is does this (at least for the power law) happen automatically with this algorithm? I could see where very large sets of terrain are increasingly unlikely. Or is there a way to seed the initial map to get a desired distribution? E.g. by placing random circles of forest that follow a power law.
@@programmingchaos8957 For the aforementioned algorithms, it's an emergent feature. Both of them are essentially cellular automata that update based on a local kernel passing over the grid element-wise with random fluctuations in cells, which would probably be more efficient than passing over the whole thing, checking the statistics and then making an adjustment and checking to see if the statistics are closer, etc. (plus I imagine that'd look more unnatural), but scaling these things up is still a nightmare. I tried a while ago getting them to a good enough looking resolution, and it got to taking tens of minutes in Python (albeit using a less-efficient algorithm for it; but I kinda had to since I wanted to also track the correlation-length over time).
The thing about the power law distribution is that it's scale-free, so you can zoom into any part of the distribution and it should be similar. This is really the property I'm interested in, because _in principle,_ if you have local information around some region in your grid, then that should also give you global information - but it only gives you the information about the distribution, not about how that distribution is realised. I just thought that it would be nice if I could exploit that fact to allow it to scale faster.
@@programmingchaos8957 Sorry, I typed a response to you, but it disappeared. The aforementioned algorithms are essentially cellular automata that check some local kernel around a cell and transform it depending on that calculated value. The power-law distribution is emergent from these rules. As they are, these algorithms scale pretty poorly both in space and (particularly of interest) time. There are more efficient algorithms that one can use (say for the Ising model), but you end up sacrificing some other things. It might be fine for the purposes of terrain generation, but I wanted to track the correlation-length of the whole system in real-time, which it looks like only the Metropolis algorithm was capable of doing. Nonetheless, I'm still unconvinced about the traditional approaches. I tried scaling these guys up before, and they took me tens of minutes on my machine with Python (I couldn't quite get what I wanted out of Numba; it threw a fit at me when trying to parse multidimensional arrays, so I shelved it. I know some C now, so I might revisit it in Cython).
The thing about power-law distributions, is that they're self-similar at different scales - if you zoom in or out of the distribution, you end up seeing the selfsame distribution. Think: the Pareto 80/20 rule in economics: 80% of the wealth is in 20% of the population; 80% of _that_ wealth is in 20% of that population, etc. etc. Similarly, with these phenomena, if you zoom in or out of these grid-arrays, the group-size distribution is the same. This property is what's known as being "scale-free". This means that _in principle,_ we should have information about the global distribution of group-sizes from local information in some region. I'd hoped I could use that fact to kind of 'compress' the grid in some way, maybe to speed things up. I would say that with these algorithms, seeding can be difficult. What tends to occur is that the major groups end up moving around in non-trivial ways, making them unrecognisable from the initial seed. You can probably obtain utility out of seeding if you don't run them for a very long time, but you want to run these things for a very long time to obtain power-law results.
Thanks for the video!
You're very welcome! I'm glad you enjoyed it.
This is great! subscribed
Thanks! I'm thrilled that you enjoyed it. Hopefully the other videos will be useful as well.
I feel stable now
Glad the video helped :)
You could also check systematically for conflicts without changing anything, keeping one at random to change later. That way you'll be sure when it returns false that there are no conflicts.
Definitely. It would be almost the same as the leastConflicts() function, but would go through the map systematically and just return whether it found any conflicts or not.
@@programmingchaos8957 Ah, perhaps I misunderstood that part of the video! it's almost the same :)
It could be interesting to try a hierarchically fractal approaches, similar to how cellular automata caves are sometimes run over several scales -- with the higher resolution passes have constraints both for neighbors and parent layer (coarser scale).
That's a great idea! I can already imagine forest tiles that you can zoom in on and see different types of forest, depending on the neighboring cells, conifers next to the mountains, deciduous next to the plains, and complex mixing in between.
Decent video, love the mod thingie for wraparound. But I must say, this seems like a very special case of cellular automata, basically where the cell update rule is constraint checking, not that it's bad or anything :) Hm, actually even wave function collapse can probably be done as a CA, might have to look at that.
I was thinking the exact same thing! A lot of the output, for example the plants, look a lot like the output of CAs. One difference is that typical CA rules only allow one output. For example, if we started with a predefined row of random terrains and filled in the rest of the map using CA rules starting at that row it would normally require rules that defined exactly which terrain went in each cell based on the previous row. But randomized rules should work. It wouldn't allow the map to wrap properly though.
@@programmingchaos8957 IMO you are selling CA short :). I'm using CA for map generation for a game I'm working on. I'm not sure what you mean by "not allow to wrap properly" - I have no problem having it implemented, on hexes. And yes, you can have non-deterministic rules there and all kinds of jazz. I'd say "one output" is also sorta underselling it (and also here I do not see anything more than one output either :). First, the output that is visible can differ from what is stored (the state), and the state can be arbitrarily complex. I.e. in my case my state is multiple "layers" of ints, so just an array :), each with different meanings, mainly used during different iterations. So you could easily have "superposition representing states" too. And also display however you want. I.e. in my case the initial runs are just numbers for which territory a cell belongs to, and I expand them using nondeterministic rules. You can then assign some of them to be land (or it can be done before :) ), and just draw the map based on that. But then I can run different rules concentrating on the height layer, taking into account land and existing heights, etc., etc. So in all it's very flexible, and that's why for this I see little difference. Still, that generation setup with constraint rules is interesting, esp. cool to see a demonstration of what the results look like! Might even experiment with something like it, but I'd probably tweak the rules a bit with regards to distance (i.e. use euclidean maybe, or some weird randomization for a bit more variety).
You're right, I was too focused on the standard CAs with relatively simple, deterministic rules. Do you have any kind of a development blog? I'd love to check it out.
@@programmingchaos8957 Not yet, but I definitely should, to help with exposure, as I intend to sell it :) Might make some videos.
14:49 There are a few other ways you could have done this. First of all, you could shuffle a list of every tile so as to prevent looking at the same tile twice. Secondly, you could store the old map and the next iteration of it, so that you can systematically go through the map referencing the old map but making all the changes to the new map, which should be more efficient. But it should still work fine either way!
A shuffle would definitely work well. I didn't use one only because there isn't a built-in shuffle function (at least that I know of) and I didn't want to take the time in the video to code it as well. For the old map/new map approach, wouldn't that also require recopying the old map back into the new map for the next cycle? Or, much more efficiently, think of them as map one and map two and just have a flag that says which is being updated on each cycle. That should work well.
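Something like this, as a sketch (pickLeastConflicts() is a stand-in for whatever computes a cell's new value from the old map):
```
int[][][] maps = new int[2][worldWidth][worldHeight];  // two full buffers
int current = 0;                                       // which buffer is 'old'

void step() {
  int next = 1 - current;                              // buffer we write into
  for (int x = 0; x < worldWidth; x++) {
    for (int y = 0; y < worldHeight; y++) {
      // every new value is computed from the old map only
      maps[next][x][y] = pickLeastConflicts(maps[current], x, y);
    }
  }
  current = next;                                      // flip - no copying needed
}
```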
@@programmingchaos8957 Yeah, that is the unfortunate thing about having two separate maps, but at least it doesn't have the artifacts!
6:33 For this particular system you could probably generate it more quickly by starting with a 2*2 grid then doubling the resolution and introducing some chaos, repeating this enough times will get the desired pixel resolution without having to generate all the pixels straight up.
Interesting idea. I'm not convinced it would be quicker, but even if it's not, this would be a great way to introduce larger land forms. E.g. mountain ranges that extend across a whole continent, inland seas, etc.
@@programmingchaos8957 Well, the reason you could get it quicker is that you can store data from the previous iteration to help speed up the current one, like the edges between beach, land, and sea; those are places where there are likely conflicts, so knowing where they are can limit your conflict resolution function to just those spots. Though you may want to have an additional sprinkling of a few random checks to make sure things aren't too uniform.
Is this equivalent to wave function collapse? Or is wave function collapse subtly different?
Wave function collapse (WFC) is a bit different. Wave function collapse starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms.
The big difference is that with WFC if you get to a variable that has no allowed values you backtrack, undoing the last few value assignments, and try again, or give up and start over. Whereas with minimum conflicts you accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer.
Honestly, as far as I can tell WFC is identical to a depth first constraint satisfaction search using forward checking.
If you do use WFC there are some heuristics that can make it much more efficient:
1. Assign values to the variables with the fewest remaining values first. E.g. if cell 1 can be any terrain, and cell 2 can only be mountains or forest, then assign a value to cell 2 first - you're much more likely to reach a solution or a dead end faster and save search time.
2. Assign values to the variables involved in the most constraints first. E.g. a cell with lots of neighbors, rather than only a few.
3. Assign values that give the most flexibility to 'neighboring' variables. In the terrain example it means assign a 'middle' terrain like plains or forest, before an edge terrain like high mountains or deep water. This will lead to solutions faster, but maybe less interesting terrain.
A superposition of possible values is commonly used to solve sudokus as well.
Yes, and it works a lot better than the minimum conflicts approach shown here, which I suspect would flail around for a long time without finding a solution. Minimum conflicts generally only works well when there are lots of solutions, but then it can be a lot faster than WFC.
3:18 Hm, one potentially effective additional constraint to make this generate better looking land is to make the grey parts in-between land and water require that there can't be a 2*2 square of only grey tiles, this will hopefully prevent too small puddles or chunks of land.
Nice idea! For the video I was intentionally keeping the number of constraints low to reduce the total programming time. But I want to try this out and see how it looks.
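If anyone else wants to experiment with it, the check might look something like this (a sketch; GREY is whatever value your in-between tiles use):
```
// Count how many all-grey 2x2 blocks the cell at (x, y) would complete if it
// were set to 'type'. Each block's top-left corner is at (x-1..x, y-1..y).
int grey2x2Conflicts(int x, int y, int type) {
  if (type != GREY) return 0;
  int conflicts = 0;
  for (int cx = x - 1; cx <= x; cx++) {
    for (int cy = y - 1; cy <= y; cy++) {
      if (cx < 0 || cy < 0 || cx + 1 >= worldWidth || cy + 1 >= worldHeight) continue;
      boolean allGrey = true;
      for (int dx = 0; dx <= 1; dx++) {
        for (int dy = 0; dy <= 1; dy++) {
          boolean isCenter = (cx + dx == x) && (cy + dy == y);
          if (!isCenter && map[cx + dx][cy + dy] != GREY) allGrey = false;
        }
      }
      if (allGrey) conflicts++;   // a 2x2 puddle of pure grey: count it
    }
  }
  return conflicts;
}
```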
@@programmingchaos8957 That's understandable, I hope it looks decent!
I am going to use a similar algorithm to simulate stuff by generating a map with one dimension: time.
Pixels are going to be replaced by states, and there is a constraint violation if state[t] can't evolve into state[t + 1].
Also I am going to add time travel to this system.
That's sounds really cool! I'd be interested in hearing about how it turns out if you want to share the results.
When to write the unit test for checkConstraints? When you first use it, you have in mind what you expect, so that's a perfect time to write the test. If you don't want to interrupt your flow, you delay the test case until you have written out all the uses of the results. I'm looking forward to having a reliable AI write the test at the same time as generating the current stub function: not yet implemented...
Writing the tests for the function is a great application area for an AI. I'm a little skeptical of the claims that it will take over programming. LLMs are great at problems that have been solved hundreds of times already - writing a resume, a legal brief, or a quicksort algorithm - which makes them seem really impressive. But most programming projects should include some new aspects, requiring new solutions, which LLMs are not so good at. But generating tests for human-written code seems like a space they would perform well in.
Interesting how you can easily "filter" randomness to end up with some order, Stable Diffusion works in a very similar way, the only difference is that instead of using some simple logic it uses a UNet to predict the noise.
Also (not that I have anything against it, but) you should consider using something like shadertoy; GPUs are much, much better at munching through tasks that can be parallelized. But for learning, yeah, using that makes it way easier to understand what is going on exactly.
I hadn't thought about the relationship to Stable Diffusion, there is an interesting similarity. And yes a shader would definitely be the way to do this if you wanted it to be super quick or had a much bigger map. As you noted, my focus is really on teaching the algorithm. But it does make me think that I should put together a video or two on basic shader programming and its applications. Thanks for the suggestion.
Very cool, thank you!
Thanks! I'm glad you liked it.
ohhhhh I'm having an idea! How about a game where you play through a map which is generating itself in real time? I wonder if that concept would be viable.... If I had more free time I would start on that tomorrow morning. :)
So, the map isn't decided at all until the player moves into the new areas? That would be interesting, especially if the player's actions influence what appeared next. So, the terrain would have to follow the rules, but if a given cell could be either plains or forest, which one it became depended (in part) on the player. For example, if they traveled from forest it was more likely to continue as forest, or they had some ability to influence the probabilities. If you make it I'll play it! :)
@@programmingchaos8957 I didn't think that far ahead :) I had an idea of the maps generating in a 2D platformer style. Having some gameplay elements advance the generation sounds more viable. Making a game out of that would require some serious deep thinking....
The "Perfect CS" algorithm would look something like this:
```
generate_random_map(); // each pixel is random value
while (!is_map_valid()) { // check if map has any constraint violations
generate_random_map(); // try again
}
```
Of course this would be very inefficient and have exponential time complexity.
So what other algorithms produce the same output (with same probabilities) as "Perfect CS"?
Is algorithm shown in this video "perfect"?
Is WFC "perfect"?
A faster solution using multiverse theory would be:
generate_random_map();
if (!is_map_valid()) destroy_universe();
@@Erkle64 True, but my computer can't destroy the Universe yet :(.
I'm not sure about perfect, but one formal term is 'complete'. A complete algorithm is guaranteed to find a solution if one exists and return false if no solution is possible. The minimum conflicts algorithm I presented is definitely not complete. It is only likely to find a solution if lots of solutions exist - which should be the case for procedural generation. WFC can be complete if, every time it reaches a variable that no longer has a valid assignment (e.g. a cell between forest and deep water), it backtracks, undoing previous assignments and trying new ones until it solves the conflict. But that type of exhaustive search tends to be a lot slower.
With regards to probabilities I suspect, but don't know for sure, that both minimum conflicts and WFC create terrain that's 'clumpier' than your algorithm would produce, because the easier-to-find solutions have large clumps of the same terrains. In fact, if a map problem is hard (for example if the range where the cells can't conflict is large) minimum conflicts tends to 'give up' and just push everything towards a middle terrain like plains.
I want to live in the universe where all of my code compiles and runs correctly the first time. But I suspect that even in an infinite multiverse that one doesn't exist :)
@@programmingchaos8957 By "perfect" I don't mean a program that finds a solution, but it also needs to have the same probability distibution as the orginal algorithm
nice video, unc 😃
Thank you! 😃
Very cool. I think you need to figure out a way to calculate every cell at once though. For example for plants, it's not that it's always the top layer that grows, but rather every plant layer divides and spreads out further, which makes it look like it's growing upward.
I don't like the idea of using random. Nothing is random in nature, and pseudo-random is just bias. I'm sure you could come up with more clever ways to do that too. For example, rule 30 cellular automata looks random, but is actually not. Incorporate some clever rules like those that exhibit complex seemingly "random" behavior, but are actually deterministic.
People used Rule 30 as a pseudo-random number generator for a long time after it was developed. pRNGs are all like this, so there isn't really a material distinction between Rule 30 and other methods like it. Also it's not correct to say pRNGs are biased. If they were, we could detect it by running one a whole heap of times, generating the empirical distribution of the numbers it produces, and testing to see if it's significantly different from a uniform distribution. We use pRNGs _because_ there is no test we have yet developed that can distinguish them from true RNGs. All we know is that they will eventually cycle, because they are deterministic.
Very interesting idea! It would be possible to go through and calculate the 'next' value, but not update it until all of the next values were calculated. This is the standard approach for cellular automata like Conway's Game of Life th-cam.com/video/SgrenppLn8c/w-d-xo.html. Of course, it could lead to the problem of picking conflicting choices for neighboring cells and having to fix it on the next iteration. It might give interesting results for the plants, not sure if it matters for the maps or textures.
I understand wanting everything to follow solid, deterministic rules. The issue with purely deterministic rules is that they will always lead to the same outcome and I want different outcomes each time. I could use deterministic rules with (pseudo)random starting conditions, but that might give a smaller range of outcomes (and doesn't completely do away with randomness).
I prefer to think of the (pseudo)randomness as representing unknown hidden variables. For example, coin flips are not truly random - in the sense that if you knew the exact forces being applied at the beginning of a flip then a physics simulator could tell you whether it would be heads or tails (and I believe that accomplished sleight of hand artists can flip a coin exactly N times to control whether it's heads or tails). But if I want a coin flip in a game I'm just going to randomly determine heads or tails, not build a physics simulator for coin flips. This is the same idea: for plants I'm not going to try to simulate all of the factors that cause a plant to branch or not, I'm just going to choose to randomly branch. However, that's because of my relatively simple goals. There's probably a good use for a program that does simulate all of those factors. (And a mini-game where you try to apply the forces to get a coin to flip exactly N times might be fun.)
Few, if any, pseudo-random number generators are biased in the sense of generating more 8's than 9's. But all(?) have artifacts that can be detected with subtle statistical tests. The Wikipedia article on pseudo-random number generators has a nice overview of some of the issues. My understanding is that because of this, for super high risk areas, like generating encryption keys for launch codes, they use hardware random number generators that use physical phenomena, like radio static, to generate random values.
Professor SOULE!!
I was wondering how to spell his surname.
Hey!
The belief, maybe true, but maybe not, is that it came from French, with an accent over the e, but no one knows for sure.
7:59 Ah! That would probably be beneficial for code readability and tutorials, I however cannot be bothered placing small comments so instead have to re-figure out my code every time I come back to it. Come to think of it, there may be a lesson to be learnt here.
I learned the hard way that these types of comments save time in the long run - at least for me. I've come back to code just a few days later and spent hours struggling with it only to realize that I misremembered what the integer values represent.
@@programmingchaos8957 Yeah, normally it takes a few weeks for me to start forgetting things, but if I ever intend on making a sequel I'm out of luck! It doesn't help that my variable/function/anything naming scheme looks like this, DucVelX. I imagine leaving the comments can be very beneficial!
Reminds me of the wave function collapse
Good catch! It's very similar, but in minimum conflicts the algorithm will assign values that violate the constraints with the hope of fixing them later. WFC traditionally will backtrack and undo some of its previous assignments (or start over) if it finds a variable (cell) with no acceptable values (e.g. no terrains that can be assigned without violating constraints). It means that WFC is more likely to find rare solutions, where minimum conflicts would just keep trying invalid values, but it also means that WFC is slower - and more complex to code. As a side note, WFC is even more similar to (maybe identical to) the forward checking algorithm for constraint satisfaction.
so, Wave function collapse is a subset of this?
Not exactly. In WFC values are only assigned if they don't violate any constraints. If the WFC algorithm finds a variable (cell) with no assignable values (e.g. mountains on one side and deep water on the other) it either backtracks, undoing and retrying some previous assignments to avoid the issue, or starts over from scratch. In contrast the minimum conflicts algorithm just assigns the value that leads to the fewest conflicts and hopes it gets fixed on a later pass, by changing some of the neighboring values. The least conflicts algorithm is simpler to code and runs faster, but if there aren't very many solutions it may fail to find a solution with no conflicts. WFC is harder to code and may take longer to run, but is more likely to find a solution when they are rare.
The one thing that always annoys me about this type of procedural generation is that it's highly non-trivial to parallelize this.
Of course, you can use ping-pong buffers, and then execute the algorithm for all cells in parallel. However, this means every cell considers a new type based on the old state independently, so after the step, the neighbors of a cell have changed as well, which changes whether the new type was an improvement or not. This means slower convergence (Gauss-Seidel vs. Jacobi is basically a similar situation), but this doesn't matter if it allows you to run it on the GPU much faster. The real problem is, this just leads to ugly maps. Sure the constraints will be fulfilled, but not in a manner that would satisfy us.
So, in order to parallelize this in a meaningful way, we would have to run N threads for some number of randomly selected cells, where those cells are independent (i.e. no two cells are in range of each other). This is non-trivial, but can be done with rejection sampling. However, this just makes things harder to parallelize again. Rejection sampling is kind of a sequential thing, after all. If you try to do it in parallel and draw multiple new samples at the same time, those samples are dependent on each other again and would affect each other in whether they need to be rejected or not... We could just use static patterns that divide our set of cells into independent clusters, but then this static non-random pattern will affect the results again. And computing sets of random independent clusters is... well, highly non-trivial.
It is not impossible, but trying to make this a parallel thing on the GPU is very unsatisfying.
Currently, the "simple" fix I use is to use a separate initialization step. I noticed that for a range of 1, the parallel version seemed to produce comparable results. But the higher the range, the more degenerate it becomes, generating mostly plains and forest in my case.
So I simply initialize cells at random in (2*range+1) X (2*range+1) groups, i.e. every 7x7 pixels for range=3 are assigned the same random cell type. This looks much closer to the iterative method.
I also tried a different method to select new types. Your current method is iterative and takes the best out of K randomly sampled types. In rare circumstances, it can pick a type that has more conflicts, and this is an important property. If I initialize leastConflicts to the current amount of conflicts and bestType to the current type, results are pretty bad again in my GPU example.
Based on this observation, what I am doing right now is to only pick a single random type to try. I compute the signed difference in conflicts and run that through an exponential function. If the new type is an improvement, the exponential function returns a value > 1. Otherwise, the input is negative, and the worse the new type is, the smaller the value, and the value will be in [0,1]. Then I use this value as an acceptance probability to allow accepting things that worsen the result (so long as the current cell isn't conflict-free already). Behaviour can be tweaked by multiplying the input to the exponential; multipliers > 1 make it less likely to pick worse cells.
For a measure of how fast this is: at 1280x720 cells and a range of 3 (so iterating 7x7 cell areas), the first frame my eyes can see when the window opens is already in a stable configuration.
Great comments and suggestions! The exponential function sounds very similar to simulated annealing. With simulated annealing the probability of acceptance of a worse solution decreases over time as the 'temperature' decreases.
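For anyone following along, the acceptance rule being described looks roughly like this (a sketch; with full simulated annealing the temperature would also shrink each pass):
```
// Metropolis-style acceptance: always keep improvements, keep a worsening
// move with probability exp(delta / temperature).
boolean accept(int oldConflicts, int newConflicts, float temperature) {
  int delta = oldConflicts - newConflicts;   // positive = improvement
  if (delta >= 0) return true;               // never reject an improvement
  float p = exp(delta / temperature);        // in (0, 1) for worsening moves
  return random(1) < p;                      // cooler temperature = pickier
}
```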
One question it raises is how to measure 'ugly' maps. They are obvious to the eye, but can we characterize them mathematically? Presumably the 'count' of terrain matters: only plains and forest isn't good. But what is the right amount of mountains (and high mountains) and water (and deep water)? And the distribution also matters, maybe a specific distribution of clump sizes? If it were possible to measure that then you could do meta-optimization on the algorithm to tweak it to give the best terrain types.
One question: as I mentioned in the video, I found using high mountains and deep water as 'anchoring' terrain seemed to help. Have you experimented with that approach at all?
@@programmingchaos8957 The way I am initializing the terrain is kind of inspired by this anchoring approach. Since I use larger squares of a uniform type, I am basically anchoring or biasing the algorithm such that it won't just end up in one of these very uniform solutions. This could of course be combined with what you showed in the video, and I think it could be further improved. For example, if you make the squares quite big, the converged solution can have some straight long edges. So one thing that would make sense is to randomly blend over to the surrounding terrain types in this initialization to remove these patterns.
I definitely remember watching videos on simulated annealing before. That might be where this idea came from, I felt like I knew about something similar where the exponential function was used and now that you mention it I think it was simulated annealing. Although I feel like I have seen these concepts in some Monte Carlo sampling strategies as well. The exponential function really is quite useful across many applications, recently I learned about how to use what is basically exponential decay for interpolating animations smoothly.
@@maxmustermann3938 I also saw some suggestions to try multi scale systems, a larger cell of forest that when zoomed in has more varied terrain. Which is quite a bit like anchoring terrain regions.
Is this same as the "Wave function collapse" algorithm?
yup
Very similar, but not quite the same (I'm pretty sure). With minimum conflicts the algorithm will assign a value (terrain) even if it violates constraints. With WFC it will only assign conflict-free values; if a variable/cell has no acceptable values left, WFC will backtrack, undoing past assignments, and try again (or will start over from scratch). This means that WFC is more likely to find a solution if they are rare, but can be much slower. WFC is, as far as I can tell, identical to the forward checking algorithm that is used for constraint satisfaction.
It seems similar to diffusion models but the rules are learned using a dataset.
I hadn't thought about it that way. But I think you're right that there are some similarities. Thanks for pointing it out.
wave function collatz?
Wave function collapse (WFC) is a bit different. Wave function collapse starts with every variable in a 'superposition' of possible states/values. Then you pick a variable (cell) to assign a value to, adjusting the allowed values for all 'surrounding' variables (i.e. any variables that share a constraint with the variable you just assigned). You repeat this process until all variables have a valid value. All of which is very similar between the two algorithms. (There are some nice blog posts out there explaining WFC more clearly than this.)
The big difference is that with WFC if you get to a variable that has no allowed values you backtrack, undoing the last few value assignments, and try again, or give up and start over. Whereas with minimum conflicts you accept assignments that violate some constraints and trust that they'll be fixed later. Minimum conflicts is easier to code, and - if there are lots of possible solutions - faster. WFC is more likely to find a solution if they are rarer.
Was anyone able to mimic the plant generation? I have been trying but can't figure out how to program the logic based on his explanation! And my output looks very messy haha
I'm happy to try to help. I began by setting the number of conflicts in checkConflicts() to 0. Then I checked the current square:
If the current square is black I had the function immediately return (0 conflicts because black is always allowed).
If the current square is green I checked only the three squares below (+1 in the y direction) the current square. If none of those were green I set the conflicts to 1 (a green square must have at least one green square below it).
If the current square is red I checked the square above it (-1 y) and set conflicts to 1 if it's not red or green (red must have red or green above it).
Then I returned the number of conflicts.
The other pieces: the bottom row (max y) is fixed as always green; search was done systematically from top left to bottom right - unlike for the maps - because the plants grow up.
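Putting those rules together, the conflict check might look roughly like this (a sketch; BLACK, GREEN, RED, and map[][] are whatever names your version uses, with y increasing downward):
```
int checkConflicts(int x, int y) {
  int type = map[x][y];
  if (type == BLACK) return 0;              // empty: always allowed
  if (type == GREEN) {                      // needs green support below
    for (int dx = -1; dx <= 1; dx++) {
      int nx = x + dx;
      if (nx >= 0 && nx < worldWidth && y + 1 < worldHeight
          && map[nx][y + 1] == GREEN) {
        return 0;                           // found a green square below
      }
    }
    return 1;                               // nothing holding it up
  }
  if (type == RED) {                        // berry: needs red or green above
    if (y > 0 && (map[x][y - 1] == RED || map[x][y - 1] == GREEN)) return 0;
    return 1;
  }
  return 0;
}
```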
You could make your code simpler by not including a separate condition for the undefined terrain type, and just treating it as a regular terrain type which never satisfies any constraints: notAllowed[0] = {100, 100, 100, 100} without changing other constraints: notAllowed[n][0] = 0.
Nice idea! I think that would be a bit simpler.
It's kinda related to the wave function collapse algorithm, but this is more broad.
Good catch! It is similar to WFC. The major difference is that minimum conflicts accepts assignments that violate some constraints with the hope that they will be fixed, by changing neighboring cells, in later passes. If WFC finds a conflict it either backtracks, undoing some of the previous assignments, or restarts from scratch.
So - why not have a sat solver on algorithms themselves, procedurally generate algorithms then constraint solve.
Interesting idea. If the constraints necessary to make the algorithm work properly were well defined this could be doable. There is a field, genetic programming, that evolves programs to solve problems. Some of the ideas of fitness from that field might translate into constraints.
Great content
Thank you! I'm very glad you enjoyed it.
22:46 - I feel very stupid now, I had struggled a lot with rendering in a given range. This is so much cleaner. Not really good at coding though.
Don't be hard on yourself. I've been coding for literally decades. And it still took me probably 6 or 8 different projects that used similar range checks to come up with that approach. I remember starting with 8 separate if statements, one for each neighbor, and each with its own boundary checking conditionals, something like 24 ifs altogether :)
@@programmingchaos8957 That is a bit relieving! Thank you for the Video :=)
Ah, so that's what God used when he created the cosmic microwave background.
I didn't even think of that. Nice spotting.
@@programmingchaos8957 Name drop me when you win the Nobel prize!