Hi. Great vid. Any specific reason why you are against using GitOps for ephemeral/dynamic environments? My plan was to use the ArgoCD PR Generator for ephemeral envs per application repo, after the image is already built and pushed to ECR. And for dynamic envs, which include the whole env (not just an app and dependencies), I hope to find a way to structure kustomize overlays and envs so that it is possible to create an env with a simple `cp -r envs/ envs/` command, and then pass/patch/override env-specific values using awk/sed and, probably, kustomize replacements. This env is just a separate namespace within an existing cluster. It will include all env-specific AWS dependencies managed with Crossplane, restore DBs from snapshots, and then deploy the k8s apps as well. This copy/delete command can be automated in a pipeline, and deletion will also be just the removal of the `envs/dyn-test-pipeline-114124` folder, for example. I am hoping to use such dynamic envs for qa/test/staging/demo/sandbox environments. ArgoCD will be used for deployment and Crossplane for env-specific AWS/DB dependency management for each env.
The main benefits of gitops are that we know at any time what the desired state is, there is continuous drift detection and reconciliation, and there is no need to open ports since it is a pull-based model. For ephemeral (temporary) envs, I rarely have the need to know what the desired state is (it'll be removed soon anyways). Sync as a manual action or something done in pipelines is often enough (no need to check for the state diffs all the time). Finally, I'm not as protective with those as with permanent envs so there is no big advantage for a pull-based mechanism. On the other hand, gitops does require a push to git for every change and it can take a while until the next reconciliation cycle. Losing a bit of time and pushing to git is nothing compared to the benefits, but I feel that we do not get much of those benefits when working with temporary envs that disappear soon after they are created. So, more often than not, I simply execute a command or two to create or destroy an ephemeral env as a result of a webhook triggered from a git repo and executed through a pipeline. I need those anyways and the question is only whether the extra steps and time spent by adding gitops to the mix are worthwhile with temp envs.
@@DevOpsToolkit That makes sense. Appreciate the explanation! I guess I could also consider just using Argo app-of-apps pattern and create such envs with a parent Argo app directly (using CLI instead of 'cp -r ...' and commit) that additionally references different application overlays/envs, which already exist in the Gitops repo. Will need to figure out how to use app-of-apps with kustomize then. Thanks!
Great video Viktor, I agree 100% with you; there are some use cases where you cannot have an ephemeral environment, but in many others, especially if you are using cloud services, why waste money keeping a dev environment up during the night as well? In my opinion GitOps can also be used like, or in place of, "push a button" as a way of interacting with something. For example, I don't want developers messing around with some GUI; I like it more when they commit/push a file, which is the thing they do and know best. In one of my latest POCs I used a ConfigMap to describe an ephemeral EKS cluster which starts at 8am and stops at 6pm.
I refer to gitops in strict terms which, among other things, assume a pull-based mechanism. I think that push (e.g. webhooks with pipelines) is faster and easier without many downsides when dealing with temporary stuff.
Just execute the command from a pipeline triggered through a webhook when a PR is created or closed. It can be kubectl, helm, terraform, or whatever you're using. GitOps requires additional steps and pull-based tools like Argo CD or Flux that are not always necessary when dealing with temporary envs.
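To make the "command from a pipeline triggered through a webhook" idea concrete, here is a minimal sketch of the mapping from a PR event to the commands a pipeline might run. Everything specific in it is an assumption for illustration: the GitHub-style action names, the `pr-<number>` namespace convention, the release name `my-app`, and the `./chart` path. It only builds the command lines; a real pipeline would execute them.

```python
from typing import List


def commands_for_pr_event(action: str, pr_number: int,
                          chart: str = "./chart") -> List[List[str]]:
    """Return the CLI commands a pipeline could run for a PR webhook event.

    The action names follow GitHub's pull_request webhook ("opened",
    "reopened", "synchronize", "closed"). The namespace convention, the
    release name and the chart path are hypothetical.
    """
    namespace = f"pr-{pr_number}"  # assumed naming convention, one namespace per PR

    if action in ("opened", "reopened", "synchronize"):
        # Create or update the ephemeral environment for this PR.
        return [[
            "helm", "upgrade", "--install", "my-app", chart,
            "--namespace", namespace, "--create-namespace",
        ]]

    if action == "closed":
        # Destroy the environment by removing its namespace.
        return [["kubectl", "delete", "namespace", namespace]]

    # Any other event (labels, comments, ...) does not touch the environment.
    return []
```

A pipeline step could then loop over the result and pass each entry to `subprocess.run(cmd, check=True)`. Swapping helm for terraform or plain kubectl only changes the lists that are returned.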
I've managed to deploy to salesforce using Github Actions/Workflows and a custom docker image which has the required binary and configuration (sfdx). Maybe it's not the greatest approach, but it's a deployment and it's not to kubernetes, hahah. Regards, thank you for your work.
You are really amazing, I'm impressed with your content 🎉 I was searching your channel for a playlist on storage but couldn't find one. Can you please do some content on storage, such as Rook for Ceph? Also, could you do a video under security about SPIFFE (Secure Production Identity Framework for Everyone)? Thanks in advance, Viktor 👍🏻
You're right. I forgot about it (and Chef). I have neither used nor seen them for a while so they slipped by me without me noticing or remembering them. Good catch.
If only EKS and its nodes could spin up faster ;-) But it is also good to keep at least your STG env, to be able to test things like infra updates & EKS upgrades. Since you don't re-create your prod on each deployment, it is also good to keep a reference system the same way. Otherwise you may hit issues when making changes in prod that never showed up in dev, because dev was created brand new and fresh with everything clean.
If it takes too long and cannot be sped up and automated, and is used all the time, keep it permanent. On the other hand if, for example, it is not used overnight, destroy it and create it in the morning (only an example).
... unless you are in a realm where you have millions of requests per minute, at a constant rate, for the entire year. We _need_ constant environments to run Gatling performance tests, and we do it almost every day in environments like performance and staging. Yeah, not everyone is the same :) Excellent video, otherwise! P.S. I should really become your patron, I've been taking far too much good info from you.
If something is happening all the time and there are no conflicts (e.g. no parallel performance tests coming from different suites), the environment should be permanent as long as it is in constant use. What you're describing sounds like a use case for a permanent cluster that is a mirror of production. The cost might be an issue though if it's a large one (massive production and equally massive performance clusters).
It is all nice and dandy until you actually start to have a complex architecture and more than 1-2 dependencies. You mention App2 and App3 in the diagram; where do those live? Are they other services that we write, are they cloud provider services, or other completely external services or APIs? It is a bit much to say that only the production environment should be permanent, and there are plenty of ways to manage and reduce costs in non-production environments if that is the issue. You can "inject" your feature branch version of an app into a permanent non-prod environment, you can do feature flagging, but in the end you need a permanent env that is not prod. One more thing, I don't agree that you can do GitOps only with Kubernetes. GitOps is a way of approaching a problem rather than a specific advantage of K8s. It is a way of defining real world infrastructure or configuration through text that can be kept in a Git repo. Then you apply best practices of testing, CI and CD. You also have ways of making sure you react to changes in your desired state, and you are able to detect drift if people still make modifications directly rather than through git.
Do you have an example of an environment (other than production) that should exist forever and ever? It does not matter whether it is in cloud, kubernetes, or somewhere else. The point is that other envs are not used all the time and, if that's the case, there is no point running them all the time. There are exceptions though. If it's used all the time, keep it permanent. If you cannot create it automatically and relatively fast (e.g. mainframe), keep it permanent. I'm guessing you're referring to one of those categories and, if you are, i agree. As for gitops... We can skip the arguing part of what it is and what it isn't. Please go to the "gitops principles" section of opengitops.dev/. Now, if there is a tool that fulfills those four points and is not running in kubernetes, i would be very grateful to know what it is. I haven't found any but that does not mean that none exist. Please let me know which tool is it.
@@DevOpsToolkit I went back to the drawing board and actually implemented something like this for a small team with a couple of services. Had a single non-prod environment that was up only when needed and with the ability to inject and use services from a PR depending on some routing rules. Now I understand better your argument and where the old-school way of thinking comes from. What I would like to add is just the fact that for most of these things to function, your team needs to be aware of how things work and how a technology can affect this approach (for example introducing pub/sub). Regarding last thing... Sometimes there is no tool but if you know the principles a big part of the problem is solved, just implement the tool if it is missing. If we all would wait for someone else to implement a tool for our problem there would be no opensource.
Yeah. You're right. But in real life, these decisions are not made by you but by your boss, who knows little or nothing about the advantages of using x or z, and no matter how hard you try to prove that you are right, in the end the best decisions are never made.
That is unfortunately often the case. It's normal that bosses do not know the details. A good boss is the one that trusts experts. A bad one is the one who tries to be in control and micromanage everything.
@@cukiris_ While ephemeral environments are probably not the first thing I'll advocate for in such a situation, they can help. If you can split the process into features and deliver each in a separate environment, testers should have an easier time to test it quicker. P.S. I know that my advice is likely not going to work, but hope dies last so I tried nevertheless. The "real" solution is to get rid of QA as a separate group and incorporate testing as part of a development workflow.
@@DevOpsToolkit This. And even if your boss knows what is up, they either sit in a lead position in development or in operations, so they can make decisions about development or operations processes and tools but not both. But siloed dev and ops will never be devops. Higher ups also think that adding docker and k8s and cicd etc TOOLS will instantly "make devops" for them, when in reality the organization would need to be restructured from the ground up. Restructuring would also mean a lot fewer managers (do-nothings and email forwarders) with no real knowledge, only authority, so they obviously will not embrace this change. I think it is nearly impossible to restructure a classic company for devops culture. Viktor, I would like to see a video about this topic: how and what can we foot-soldiers do in situations like this, other than finding a company with real devops culture? The latter is also hard because everything is devops in job descriptions nowadays :)
I am not a good person to make a video on that subject. In one of my previous companies, I spent years fighting for a change. I failed and was left with only one option. I moved to a different company and left the old one to rot. That being said, me failing does not mean that it cannot be done but only that I failed and, hence, do not have a good story to tell on that subject.
I don't think there's a lot of work involved. If you already have everything ready to manage apps in production, translating that to ephemeral environments is trivial. The problem is if you do not have production well defined or if it's based on obsolete tech.
you are totally wrong here. it is a lot of pain to spin up such clusters + all the software with the dependencies. we already have such solutions on GCP and I hate it. GCP is not always fast, creating k8s clusters can take 30 minutes, plus you need to deploy microservices and dependencies etc... Your approach works only for hello world applications.
Having a permanent cluster with apps used as dependencies is great. But, not as a result of making PRs. As for the time needed to create clusters... Have you tried vCluster? It takes seconds to spin up a virtual cluster which might not be good enough for production but is often great for everything else. If an app under development is deployed there, it can easily connect to other apps (dependencies) from a production-like permanent cluster. That way, you can have as many or as few as you need.
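As a rough sketch of the vCluster lifecycle mentioned above, the helper below builds the `vcluster` CLI invocations for creating, connecting to, and deleting a virtual cluster inside a namespace of the host cluster. The cluster and namespace names are hypothetical, and the sketch assumes the `vcluster` CLI is installed wherever the commands end up being run.

```python
from typing import Dict, List


def vcluster_commands(name: str, host_namespace: str) -> Dict[str, List[str]]:
    """Build vcluster CLI invocations for one virtual cluster.

    `name` and `host_namespace` are placeholders chosen by the caller,
    e.g. one virtual cluster per developer or per pull request.
    """
    target = [name, "--namespace", host_namespace]
    return {
        # Create the virtual cluster inside a namespace of the host cluster.
        "create": ["vcluster", "create", *target],
        # Point the local kubeconfig at the virtual cluster.
        "connect": ["vcluster", "connect", *target],
        # Remove the virtual cluster once it is no longer needed.
        "delete": ["vcluster", "delete", *target],
    }
```

Since the virtual cluster's workloads run on the host cluster, the app deployed there can still reach dependencies that live in the permanent, production-like environment.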
If something is not working for you, it doesn't mean that it is wrong for everyone... For us temp envs are working great. On the other hand, at my previous job there were 3k microservices, so you would never bring up all the dependencies for testing :). They just roll everything out right to prod with canary. So for them it is also not suitable, which doesn't mean that it is "totally wrong" for everyone.
How do you create and manage your environments?
Maan, you made me cry. You know why? I have been trying to explain the exact same thing you are describing here to principal engineers for 3 weeks and all of them are against me.
Fantastic - so many developers I work with that push back on DevOps haven't really dug into what it can do for overall developer experience. Those bottlenecks/deadlocks you talk about are REAL. "Well, I have to tell the DBA what database I need in the dev environment, but I don't even know yet!". That stuff puts so many back into a sort of pseudo-waterfall world. Great content as always.
100% agree Viktor! Thanks for this video. I'd love to hear your thoughts on how to manage things like data layers, e.g. a database, redis, pubsub in these ephemeral environments. For the most part, it's possible to spin these up in Kubernetes using the official applications or emulators and seed data, but it's a nightmare to manage, and moving these to the cloud is expensive, especially if you want a unique data layer per ephemeral env.
That's a good idea for a separate video. Adding it to my to-do list...
Would love a video on that as well. In our envs we need database and storage. We use GCP with Cloud SQL (MySQL) and cloud storage. For preview environments we manually copy the database but reuse the staging storage buckets.
yes, this is a pain of all projects I have seen - environments
The elephant in the room is what you do with databases and other stateful workloads. It is indeed trivial to recreate a "staging/pre-prod" environment from scratch but in reality you also want some data that is close to production. This means that you either need a well disciplined db team that takes a subset of production data, anonymizes it and has it available on demand OR that you have some kind of script automation that not only can create a db but also seed it with realistic data that mimics production.
Most people don't have either, so they need a permanent staging environment to go along with the db, which might also be used by other teams apart from devs (e.g. business intelligence).
So at least right now, I would suggest that most companies also keep a permanent pre-staging/shadow/pre-prod env.
In other news, nice to finally see Telepresence get the mention it deserves 🙂
You're right. If you do not have a way to generate data you need, you cannot create envs on a whim and hence cannot have ephemeral envs. That also means that you have bigger problems to solve than the nature of your envs.
Recently, a third option has emerged: DB branching. Databases like Neon / Dolt allow quick and easy snapshotting and branching of the data. So you could "seed" some test data in the system, and then for every feature / PR branch off that data and have unique envs. These could even have different schemas. All that is scriptable. And when you shut down such an environment, you stop using the CPU. Basically, you only need some storage for this trunk data.
So many options, my head is spinning! Thanks for such an in-depth review of all the options. Now, which to choose...
That is what I am looking for 🎉 Btw, to answer your last-minute question: the CI system triggers Terraform, which creates a temporary environment for the pull request (let's say on AWS ECS Fargate).
Hey there! Been using Bunnyshell for a while now for managing ephemeral envs (deployed in K8s or through Terraform / Helm charts) and for remote dev. ☁
I've been wanting something like this for years and "theorizing it" like this video shows. Though I would like to see implementation.
We are working towards having a "staging" environment as the core environment. Everything works off of Stage. Production gets updates from Stage. Feature/Dev instances are created off of Stage (and are ephemeral). In fact, the tenants will always start off with a stage instance at first and no production. They can use it to preview apps and how the platform works, and mess with the apps using fake data, basically to test whether the apps fulfill the tenant's own requirements. If the tenant finds the apps acceptable, they are "promoted" by the tenant to production. Nothing gets through to production if it hasn't gone through stage (and passed a lot of testing) first. In other words, code changes cannot be made directly in production. So, we need two "constantly live" instances in our system for each tenant. As the production system grows and real data can be used, the Stage instance can get a snapshot of real data at certain intervals (not more than once a month, for example). These updates are called refreshes. That is the plan. We still have a good bit of work to do to make it happen. 😁
We've implemented a first phase of ephemeral envs (for dev, features and tests) but with a new cluster for each of them. That's definitely not convenient and it's the reason why we would like to move to the next iteration.
We want to share the cluster tooling so going forward with a shared cluster makes sense I think.
We will also work with preproduction and production environments provided dynamically but we keep preprod permanently to have dbs representing with more fidelity what we have in prod.
We're not ready for that (as we have to deal with a better egress management) but we'd also like to rely on this dynamic approach for DR.
I think there is value in having permanent staging depending on your organisation.
Non-technical people will want to be aware of what will soon be rolled out. Think of it as an internal beta testing group :).
Same if you supply staging and/or mirror environments for external entities to use (b2b partners, or other integrators)
But for the purpose of each developer's own feature branch testing, yeah, having dynamic envs is nice :)
I'm totally with you, except for pre-production, as you usually test the upgrade path from the version actually in production to the new version, process-wise. There is still no need to have it up and running the whole time, but you might want to have the state of the deployed application and test the migration to the new one, and there is State! NOT everything is as stateless as we would like. At the very least, this step needs to be able to create the config and (partly) the payload state of production before you test an update in Pre-Prod.
Agree 100%
I really like the devcontainer approach here. Say I have a virtual cluster spun up with vcluster and maybe the needed dependencies already set up (thank you Crossplane); as a developer I can just use vscode to connect to a running Pod in that cluster and start working as if everything were on my laptop.
Still having mostly customers that use permanent environments and it is very difficult to convince a manager not to do this or to begin adopting ephemeral environments.
However there are other reasons not to use permanent environments, for example the fact they tend to misalign and are very friendly to snowflake servers. When you deploy a part of an existing system, you are actually merging the systems' artifacts and binaries, and very quickly you actually obtain different systems in ways that are immensely subtle and in fact what you test is not 100% relevant anymore for the next stage.
In terms of vcluster, have you managed to run different CNIs inside a vcluster instance?
I haven't had the need for CNIs inside vClusters so i haven't tried 😔
@@DevOpsToolkit TL;DR: it does not work, since a CNI is a global resource and, even worse, it relies on commands and files on the node.
@@holgerwinkelmann6219 You're probably right. Services and Pods are running on a host cluster so I would be surprised if having a CNI inside a vCluster would work.
I like the idea and in fact in a project I was thinking about doing the same. The issue I've had with this is that the kubernetes cluster I work with is on prem and not as scalable on demand as an AWS cluster could be, for example.
So with enough feature branches in parallel the amount of concurrent systems running could eat up the resources.
Permanent test systems meant more easily manageable resource demands. I've instead reserved 3 slots for short lived on demand deployments but have develop and release stages permanent.
The advantage to longer running stages is also that you more easily see those problems that tend to rear their ugly gremlin heads whenever a system runs for a couple of days and is exposed to regular test automation etc.
It's probably an anti pattern of some kind to test your stability that way, but it kinda works.
Namespaces or, even better, virtual clusters might help. Have you tried vCluster?
@@DevOpsToolkit I haven't, thanks for the tip. I'll look into it.
Thank you for this video. What are some practical strategies for managing data for development databases? For example, if a developer runs a database inside a virtual cluster, how is that database populated, where does the data come from, and how is the data managed?
I don't think that it makes much of a difference where the DB is. You can either connect to an existing DB or create a new one. In the latter case, you need a schema (I prefer SchemaHero for that) and you need data which, for development purposes, often isn't much. You can use the same dataset you use for testing or anything else. At the end of the day, whoever is writing a new feature should, among other things, be able to create the dataset required to test that feature. Only that person knows what is needed to demonstrate that a feature works. Otherwise, how is he or she going to develop it? Later on, when it reaches a preproduction stage and you run the whole test suite, you might consider a fragment of production data as a dataset.
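As a minimal sketch of "a schema plus a small feature-specific dataset", assuming SQLite purely for illustration (the comment does not name a database engine, and the `orders` table and its rows are invented):

```python
import sqlite3

# Hypothetical schema for the feature being developed.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    total_cents INTEGER NOT NULL
)
"""

# Small dataset written by the developer of the feature.
FEATURE_DATASET = [
    ("alice", 1999),
    ("bob", 0),  # Edge case the feature has to handle.
]


def create_dev_database(path=":memory:"):
    """Create a throwaway database with the schema and the seed rows."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO orders (customer, total_cents) VALUES (?, ?)",
        FEATURE_DATASET,
    )
    conn.commit()
    return conn
```

The same seed rows can double as test fixtures, which matches the point that the development dataset and the test dataset can be one and the same.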
Let's say we have a constellation of micro-services (a lot of them, 30 for example), and I'm applying changes to one of them.
Each micro-service has its own backend, frontend and even database.
The micro-service I work on could communicate with multiple other ones, through a message broker or direct HTTPS requests.
I think it's not a good idea to deploy ALL the micro-services in an ephemeral environment.
I also can't deploy only my micro-service and connect to the others, because my tests will impact the databases of each one. (And should those 'other' micro-services be running in a permanent environment?)
And it's difficult to know exactly, among those 30 micro-services, which ones are necessary for mine to work.
What is your advice in this case?
(Love your channel)
It all depends on whether they are truly microservices (developed and deployed independently from each other). If they are, you can connect the one you're working on to dependencies running in production or some other permanent environment. Or, if dependencies are light, you can deploy them in the same ephemeral environment, but that is usually the case only with very small systems.
Even better, you would have well-established API contracts. If you do, creating mocks should be part of that contract and you could use them instead of direct dependencies (and they would not need dependencies of their own). With good contracts, you do not need to test the whole system but only to ensure that you are fulfilling those contracts.
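A rough illustration of the contract-plus-mock idea, assuming a hypothetical dependency that returns user records. All names (`USER_CONTRACT`, `MockUserService`, `greeting_for`) are invented for this sketch; real projects often use dedicated contract-testing tools instead.

```python
# Hypothetical contract: field name -> required type in a user record.
USER_CONTRACT = {"id": int, "name": str, "active": bool}


def satisfies_contract(payload, contract):
    """Check that payload has exactly the contract's fields and types."""
    if set(payload) != set(contract):
        return False
    return all(isinstance(payload[key], kind) for key, kind in contract.items())


class MockUserService:
    """Stand-in for the real dependency; needs no dependencies of its own."""

    def get_user(self, user_id):
        return {"id": user_id, "name": "Test User", "active": True}


def greeting_for(user_service, user_id):
    """Code under test: relies only on fields promised by the contract."""
    user = user_service.get_user(user_id)
    return f"Hello, {user['name']}!" if user["active"] else "Account disabled"
```

The service under development is tested against `MockUserService`, while the team owning the real dependency runs `satisfies_contract` against its own responses, so neither side needs the whole system deployed.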
Hi. Great vid. Any specific reason why you are against using GitOps for ephemeral/dynamic environments? My plan was to use ArgoCD PR Generator for ephemeral envs per application repo, after the image is already built and pushed to ECR. And for dynamic envs, which include the whole env (not just an app and dependencies), I hope to find a way to structure kustomize overlays and envs in some way, where it will be possible to create an env with a simple `cp -r envs/ envs/` command, and then pass/patch/override env-specific values using awk/sed and kustomize replacements probably. This env is just a separate namespace within an existing cluster; it will include all env-specific AWS dependencies managed with Crossplane, restore DBs from snapshots, and then deploy the k8s apps as well. This copy/delete command can be automated in a pipeline, and deletion will also be just the removal of the `envs/dyn-test-pipeline-114124` folder, for example. I am hoping to use such dynamic envs for qa/test/staging/demo/sandbox environments. ArgoCD will be used for deployment and Crossplane for env-specific AWS/DB dependencies management for each env.
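The copy-and-patch flow described above might look something like this in a pipeline step. It is a sketch under assumptions: the template folder name and the `ENV_NAME` placeholder are hypothetical (the comment leaves the source folder unspecified), and a plain string replacement stands in for the awk/sed/kustomize patching.

```python
import shutil
from pathlib import Path


def create_env(envs_dir, template, name, placeholder="ENV_NAME"):
    """Copy a template env folder and substitute the env name in its files."""
    source = Path(envs_dir) / template
    target = Path(envs_dir) / name
    shutil.copytree(source, target)  # Raises if the target already exists.
    for path in target.rglob("*.yaml"):
        text = path.read_text()
        path.write_text(text.replace(placeholder, name))
    return target


def delete_env(envs_dir, name):
    """Remove the env folder (the change still has to be committed and pushed)."""
    shutil.rmtree(Path(envs_dir) / name)
```

For example, `create_env("envs", "template", "dyn-test-pipeline-114124")` followed by a commit would give the GitOps tool a new folder to sync, and `delete_env` plus a commit would remove it.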
The main benefits of GitOps are that we know at any time what the desired state is, there is continuous drift detection and reconciliation, and there is no need to open ports since it is a pull-based model. For ephemeral (temporary) envs, I rarely have the need to know what the desired state is (it'll be removed soon anyway). Sync as a manual action or something done in pipelines is often enough (no need to check for state diffs all the time). Finally, I'm not as protective of those as of permanent envs, so there is no big advantage to a pull-based mechanism. On the other hand, GitOps does require a push to Git for every change, and it can take a while until the next reconciliation cycle. Losing a bit of time and pushing to Git is nothing compared to the benefits, but I feel that we do not get much of those benefits when working with temporary envs that disappear soon after they are created. So, more often than not, I simply execute a command or two to create or destroy an ephemeral env as a result of a webhook triggered from a Git repo and executed through a pipeline. I need those anyway, and the question is only whether the extra steps and time spent adding GitOps to the mix are worthwhile with temp envs.
@@DevOpsToolkit That makes sense. Appreciate the explanation! I guess I could also consider just using Argo app-of-apps pattern and create such envs with a parent Argo app directly (using CLI instead of 'cp -r ...' and commit) that additionally references different application overlays/envs, which already exist in the Gitops repo. Will need to figure out how to use app-of-apps with kustomize then. Thanks!
Great video Viktor, I agree 100% with you; there are some use cases where you cannot have an ephemeral environment, but in many others, especially if you are using cloud services, why waste money keeping a dev environment up during the night as well?
In my opinion, GitOps can also be used like, or in place of, a "push a button" way of interacting with something. For example, I don't want developers messing with some GUI; I prefer it when they commit/push a file, which is the thing they do and know best. In one of my latest POCs I used a ConfigMap to describe an ephemeral EKS cluster which starts at 8am and stops at 6pm.
I refer to GitOps in strict terms which, among other things, assume a pull-based mechanism. I think that push (e.g. webhooks with pipelines) is faster and easier, without many downsides, when dealing with temporary stuff.
What do we need to use for preview envs if not GitOps? I didn't get it from the video.
Just execute the command from a pipeline triggered through a webhook when a PR is created or closed. It can be kubectl, helm, terraform, or whatever you're using. GitOps requires additional steps and pull-based tools like Argo CD or Flux that are not always necessary when dealing with temporary envs.
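A minimal sketch of that webhook-to-command mapping, assuming GitHub-style pull request actions (`opened`, `reopened`, `synchronize`, `closed`) and Helm as the deployment tool. The function name, the `pr-<number>` naming scheme, and the `./chart` path are hypothetical.

```python
def commands_for_pr_event(action, pr_number, chart="./chart"):
    """Map a pull request webhook action to the CLI commands to run."""
    namespace = f"pr-{pr_number}"  # One namespace and release per PR.
    if action in ("opened", "reopened", "synchronize"):
        # Create or update the preview environment.
        return [[
            "helm", "upgrade", "--install", namespace, chart,
            "--namespace", namespace, "--create-namespace",
        ]]
    if action == "closed":
        # Tear the preview environment down.
        return [
            ["helm", "uninstall", namespace, "--namespace", namespace],
            ["kubectl", "delete", "namespace", namespace],
        ]
    return []  # Ignore events that do not affect the environment.
```

The pipeline would then run each returned command in order, for example with `subprocess.run(command, check=True)`.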
I've managed to deploy to Salesforce using GitHub Actions/Workflows and a custom Docker image which has the required binary and configuration (sfdx). Maybe it's not the greatest approach, but it's a deployment and it's not to Kubernetes, hahah.
Regards, thank you for your work.
I need to wait for Joe to finish his work before I start mine and Michael needs to wait for Joe and me 😂 Resource starvation 101.
You are really amazing I'm impressed with your content 🎉
I was searching your channel for a playlist on storage but couldn't find one. Can you please do some content on storage, such as Rook for Ceph? Also, could you do a video under security about SPIFFE (Secure Production Identity Framework for Everyone)?
Thanks Viktor in advance 👍🏻
Adding both to my to-do list...
I would argue that using Puppet master + puppet agent + auto run is a form of GitOps that can be used outside Kubernetes. No?
You're right. I forgot about it (and Chef). I have neither used nor seen them for a while, so they slipped by me without me noticing or remembering them. Good catch.
If only EKS and its nodes could spin up faster ;-)
But it is also good to keep at least your STG env, to be able to test things like infra updates & EKS upgrades. Since you don't re-create your prod on each deployment, it is also good to keep a reference system managed the same way; otherwise you may hit issues when making changes in prod that never show up in dev, because there you created a brand new and fresh env with everything clean.
If it takes too long to create, cannot be sped up and automated, and is used all the time, keep it permanent. On the other hand, if, for example, it is not used overnight, destroy it and create it again in the morning (only an example).
@@DevOpsToolkit check out Codesphere, it's still in beta but will support Helm soon and can spin up in one second
... unless you are in a realm where you have millions of requests per minute, at a constant rate, for the entire year. We _need_ constant environments to run Gatling performance tests, and we do it almost every day in environments like performance and staging. Yeah, not everyone is the same :) Excellent video, otherwise! P.S. I should really become your Patreon supporter, I've been taking far too much good info from you.
If something is happening all the time and there are no conflicts (e.g. no parallel performance tests coming from different suites), the environment should be permanent as long as it is in constant use. What you're describing sounds like a use case for a permanent cluster that is a mirror of production. The cost might be an issue though if it's a large one (massive production and equally massive performance clusters).
The bubble sound effects distract from your discussion.
Great feedback. I'll lower or remove them.
It is all nice and dandy until you actually start to have a complex architecture and more than 1-2 dependencies. You mention App2 and App3 in the diagram; where do those live? Are they other services that we write, are they cloud provider services, or other completely external services or APIs? It is a bit much to say that only the production environment should be a permanent one, and there are plenty of ways to manage and reduce costs in non-production environments if that is the issue. You can "inject" your feature branch version of an app into a permanent non-prod environment, you can do feature flagging, but in the end you need a permanent env that is not prod.
One more thing, I don't agree that you can do GitOps only with Kubernetes. GitOps is a way of approaching a problem rather than something specific to K8s. It is a way of defining real-world infrastructure or configuration through text that can be kept in a Git repo. Then you apply best practices of testing, CI and CD. You also have ways of making sure you react to changes in your desired state, and you are able to detect drift if people still make modifications directly rather than through Git.
Do you have an example of an environment (other than production) that should exist forever and ever? It does not matter whether it is in cloud, Kubernetes, or somewhere else. The point is that other envs are not used all the time and, if that's the case, there is no point running them all the time. There are exceptions though. If it's used all the time, keep it permanent. If you cannot create it automatically and relatively fast (e.g. mainframe), keep it permanent. I'm guessing you're referring to one of those categories and, if you are, I agree.
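The two exceptions above boil down to a small rule of thumb, sketched here. The function name and the 15-minute threshold for "relatively fast" are assumptions made for illustration; the comment gives no number.

```python
def environment_lifetime(used_all_the_time, automated, minutes_to_create,
                         max_minutes=15):
    """Return "permanent" or "ephemeral" following the rule of thumb above."""
    if used_all_the_time:
        return "permanent"  # No idle time to save by destroying it.
    if not automated or minutes_to_create > max_minutes:
        return "permanent"  # e.g. a mainframe that cannot be recreated quickly.
    return "ephemeral"
```

Under these assumptions, an automated environment that takes a few minutes to create and sits idle overnight comes out as `"ephemeral"`, while a constantly used performance cluster comes out as `"permanent"`.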
As for GitOps... We can skip the arguing part of what it is and what it isn't. Please go to the "GitOps principles" section of opengitops.dev/. Now, if there is a tool that fulfills those four points and is not running in Kubernetes, I would be very grateful to know what it is. I haven't found any, but that does not mean that none exist. Please let me know which tool it is.
@@DevOpsToolkit I went back to the drawing board and actually implemented something like this for a small team with a couple of services. Had a single non-prod environment that was up only when needed and with the ability to inject and use services from a PR depending on some routing rules. Now I understand better your argument and where the old-school way of thinking comes from. What I would like to add is just the fact that for most of these things to function, your team needs to be aware of how things work and how a technology can affect this approach (for example introducing pub/sub).
Regarding the last thing... Sometimes there is no tool, but if you know the principles a big part of the problem is solved; just implement the tool if it is missing. If we all waited for someone else to implement a tool for our problem, there would be no open source.
Yeah. You're right. But in real life, these decisions are not made by you but by your boss, who knows little or nothing about the advantages of using x or z, and no matter how hard you try to prove that you are right, in the end the best decisions are never made.
That is unfortunately often the case. It's normal that bosses do not know the details. A good boss is the one that trusts experts. A bad one is the one who tries to be in control and micromanage everything.
@@DevOpsToolkit Imagine a qa team that takes 2 months to test in a test environment. How to have dynamic environment? This is my real life.
@@cukiris_ While ephemeral environments are probably not the first thing I'd advocate for in such a situation, they can help. If you can split the process into features and deliver each in a separate environment, testers should have an easier time testing each one more quickly.
P.S. I know that my advice is likely not going to work, but hope dies last so I tried nevertheless. The "real" solution is to get rid of QA as a separate group and incorporate testing as part of a development workflow.
@@DevOpsToolkit This. And even if your boss knows what is up, they sit in a lead position either in development or in operations, so they can make decisions about development or operations processes and tools, but not both. But siloed dev and ops will never be DevOps. Higher-ups also think that adding Docker and k8s and CI/CD etc. TOOLS will instantly "make DevOps" for them, when in reality the organization would need to be restructured from the ground up. Restructuring would also mean a lot fewer managers (do-nothings and email forwarders) with no real knowledge, only authority, so they obviously will not embrace this change. I think it is nearly impossible to restructure a classic company for a DevOps culture. Viktor, I would like to see a video about this topic: how and what can we foot soldiers do in situations like this, except finding a company with a real DevOps culture? The latter is also hard because everything is DevOps in job descriptions nowadays :)
I am not a good person to make a video on that subject. In one of my previous companies, I spent years fighting for a change. I failed and was left with only one option. I moved to a different company and let the old one rot. That being said, me failing does not mean that it cannot be done, but only that I failed and, hence, do not have a good story to tell on that subject.
Too many repetitions of the same thing
Seems to me a lot of work just to start the real development
I don't think there's a lot of work involved. If you already have everything ready to manage apps in production, translating that to ephemeral environments is trivial. The problem is if you do not have production well defined or if it's based on obsolete tech.
You are totally wrong here. It is a lot of pain to spin up such clusters plus all the software with the dependencies. We already have such solutions on GCP and I hate it. GCP is not always fast, creating k8s clusters can take 30 minutes, plus you need to deploy microservices and dependencies etc... Your approach works only for hello world applications.
Having a permanent cluster with apps used as dependencies is great. But, not as a result of making PRs. As for the time needed to create clusters... Have you tried vCluster? It takes seconds to spin up a virtual cluster which might not be good enough for production but is often great for everything else. If an app under development is deployed there, it can easily connect to other apps (dependencies) from a production-like permanent cluster. That way, you can have as many or as few as you need.
If something is not working for you, it doesn't mean that it is wrong for everyone...
For us temp envs are working great.
On the other hand, at my previous job there were 3k microservices, so you would never bring up all dependencies for testing :). They just roll out everything right to prod with canary.
So for them it is also not suitable, which doesn't mean that it is "totally wrong" for everyone
...also, connecting apps from temp envs with those from permanent ones (e.g. staging or production) is both valid and common.
@@len4ezz Most underrated comment here!