How do you manage your DB schemas (in case your DB does have a schema)?
Using flyway or a private migration tool based on SQL.
@@FernandoAlmeida1973 Flyway is great :)
Clearly in my top 5 YouTube influencers!!
Thank you for this great video (and for all others :) )
Great video, Viktor!
I was initially excited by the title but this doesn't appear to apply to my use case. How would this work with frameworks that manage their own migrations - like Django - if it even does? Currently you can use an initContainer or job to apply the migration, but I don't like either way. Definitely would like more DB/PostgreSQL-related videos.
That solution works only in situations where app and DB management are decoupled.
What Viktor said. For example, we recently decoupled migrations from all of our various applications since we're moving (back) to a single DB. This means all of the Django models are "unmanaged", and we keep the schema changes in Liquibase.
Thanks for sharing :) Love it.
Great content! I like all your videos. One question, Viktor: is there a similar kind of k8s-native tool for application testing? Something where test cases are defined in YAML and a 'test' controller picks them up and runs them.
I haven't been exploring that option myself. For now, I tend to use Argo Workflows and Tekton to orchestrate steps, including testing. It would be great to have something like that though. I'll look into it...
Testkube
This is AWESOME and will fix a lot of my personal issues with database migrations using GitOps. I was wondering how/if this can be coupled with ArgoCD? I already have my applications deployed with ArgoCD and this will be the missing piece to my setup. Thanks!
Being able to define everything as k8s manifests is the goal, partly because that enables integrations with tools like Argo CD and Flux. In the case of SchemaHero, I keep its manifests together with the rest of the manifests of an app and let Argo CD figure out what to sync and when.
GitOps through Argo CD or Flux is one of the reasons why today I prefer KubeVela for apps, Crossplane for infra and services, and SchemaHero for DB migrations.
So, the short answer to your question is a huge YES.
Another very good and useful video. Thank you
But an important question is backups. How can we sync database backups and k8s resources in a recovery use case?
Or do database backups even make sense in a Kubernetes-native environment?
As far as I know, SchemaHero is not solving the "backup issue". I'm not even sure whether it should even try to do that since most DB providers have that solved. An alternative to a DB-specific backup/restore solution might be velero.io/ (if you're running the DB in k8s).
What about existing tables? Do we need to yamlify all existing tables? I think this could be for users who are about to start setting up their CI/CD processes. For now we are using k8s jobs for migrations but I would eventually like a better approach. To use this, I guess we first need to make the database Git-based using YAML and then start using it?
You do not have to add existing tables to SchemaHero. It is not mandatory for everything to be in it.
@@DevOpsToolkit yeah I got your point. I wanted to use it for an existing database and tables. I think that won't be the case, right?
Anyways thanks for the lovely videos and content. You rock !!
OK, but how do you integrate SQL Server migration scripts into this kind of JSON format?
I abandoned SchemaHero. I used it at the time since it was the only one designed to run in Kubernetes. It was the best not because it was good but because it was the only one. In the meantime, Atlas released a Kubernetes operator and became my go-to tool for managing DB schemas. You'll find a few videos about it on this channel.
Couple of questions:
- What about more complex database objects, like Views, Materialized Views, Enums, Custom Types, Database Stored Procedures and Functions, Triggers, ... how good is SchemaHero at finding differences between their versions in order to figure out a migration to apply? Changes in these database objects sometimes mean that whole objects need to be dropped and recreated, cascading to all their dependencies, and those dependencies might be objects that are slow to recreate, like materialized views. In my opinion, fully automating database schema migrations by specifying only a target state is still not very practical for complex database schemas and schemas with a lot of data.
- Does SchemaHero help with blocking all the traffic from our application to the database (and maybe even to our application that depends on our database and resides in the same Kubernetes cluster) while database migrations are being applied? Complex database migrations can be lengthy and may lock the tables, making the database unusable while migrations are being applied.
- Would very much like to hear your opinion about Flyway. My personal opinion is that it is useful in automating database migrations for Project and PaaS business models (where there is only one production environment), while it has serious problems and is not very usable in the Product business model, where there are many production environments, one for each customer, each potentially on a different version (i.e. when each customer has different release approval policies).
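To make the "only specifying a target state" point above concrete, here is a minimal sketch of how a declarative tool can derive a migration by diffing the current columns of a table against the desired ones. This is not SchemaHero's actual implementation; the `diff_columns` function, the column maps, and the PostgreSQL-style statements are assumptions made for illustration only.

```python
def diff_columns(current: dict[str, str], desired: dict[str, str], table: str) -> list[str]:
    """Return SQL statements that move `table` from `current` to `desired`.

    Both arguments map column name -> column type (hypothetical, simplified model).
    """
    statements = []
    # Columns present in the desired state: add them or change their type.
    for name, col_type in desired.items():
        if name not in current:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {col_type};")
        elif current[name] != col_type:
            statements.append(f"ALTER TABLE {table} ALTER COLUMN {name} TYPE {col_type};")
    # Columns that exist today but are absent from the desired state: drop them.
    for name in current:
        if name not in desired:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {name};")
    return statements


# Hypothetical example: what the database has now vs. what the manifest declares.
current = {"id": "integer", "name": "varchar(50)", "legacy_flag": "boolean"}
desired = {"id": "integer", "name": "varchar(255)", "email": "varchar(255)"}

plan = diff_columns(current, desired, "users")
for statement in plan:
    print(statement)
```

A column-level diff like this is straightforward. The objects listed in the question (views, materialized views, triggers, functions) are harder because a change may force dropping and recreating dependent objects in the right order, which a simple diff of this kind does not model.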
- As far as I know, SchemaHero works only with tables. If you do have things like stored procedures, SchemaHero is off the table.
- SchemaHero does not block any traffic. That would need to be done on the application level and is (and should be) out of the scope of SchemaHero. Even if you do block traffic directly in the database, that would result in a random experience from the user's perspective. Personally, in almost all cases, I would not accept any downtime produced by DB migrations. That would reflect very badly on the business. Imagine Google not working because they're migrating DBs. That being said, there are cases when prolonged downtime is needed, but that needs to be a calculated, not-to-be-repeated exception.
- Flyway is great. I love it. Now, keep in mind that I'm saying that from the perspective of daily usage (just as with SchemaHero) and databases that are changing in short iterations (as opposed to infrequent accumulated changes). There might be situations when things go terribly wrong and we might need a workaround, but those are exceptions (in my case).
What is your preference for the cases when Flyway is not a good option? Liquibase?
@@DevOpsToolkit I was thinking more that SchemaHero could somehow tell Kubernetes to block traffic to either the database or to our application depending on the database, if it is tightly integrated into Kubernetes.
- Re Flyway, when it did not work for my company, we manually wrote condensed migration scripts (accumulating many past migrations into one big migration, but more efficient than just executing many migrations serially) and wrote custom automation tools to run them. I agree Flyway is great when the target database is never lagging far behind.
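As a rough illustration of the many-customers scenario described above, the sketch below picks the versioned migrations that each lagging database is still missing and bundles them into a single transactional script. It only selects and concatenates; the hand-optimized condensing the comment mentions (collapsing redundant steps into something more efficient) is not attempted. The `MIGRATIONS` map, the customer names, and `bundle_pending` are hypothetical and not part of Flyway or any other tool.

```python
# Hypothetical versioned migrations, keyed by version number.
MIGRATIONS = {
    1: "CREATE TABLE users (id integer PRIMARY KEY);",
    2: "ALTER TABLE users ADD COLUMN name varchar(50);",
    3: "ALTER TABLE users ALTER COLUMN name TYPE varchar(255);",
}


def bundle_pending(migrations: dict[int, str], applied_version: int) -> str:
    """Bundle every migration newer than `applied_version` into one script.

    Returns an empty string when the database is already up to date.
    """
    pending = [sql for version, sql in sorted(migrations.items()) if version > applied_version]
    if not pending:
        return ""
    # Wrap the pending statements in a single transaction.
    return "\n".join(["BEGIN;", *pending, "COMMIT;"])


# Each customer environment may sit on a different schema version.
customer_versions = {"customer_a": 3, "customer_b": 1}
scripts = {name: bundle_pending(MIGRATIONS, version) for name, version in customer_versions.items()}

print(scripts["customer_b"])
```

Here `customer_a` is up to date and gets an empty script, while `customer_b` gets versions 2 and 3 in one transaction.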
@@miletacekovic Unfortunately, SchemaHero does not block traffic. You'd need to do that separately and it would depend on what you're using for networking.
Can I use SchemaHero in a multi-tenant architecture to share database schema usage between tenants?
Would it be the same DB inside a shared database instance or different ones (one for each tenant)?
@@DevOpsToolkit same
@@TechStory5 If by usage you mean that different tenants should share the same DB, that's not a problem, at least not technically. You just need to give those tenants access to the same DB. SchemaHero is not trying to help there. It's not about how someone or something accesses the DB but how DB schemas are managed.
I might have misunderstood your question. Please let me know if that's the case.
@@DevOpsToolkit Yes Thank you
Somehow the workflow of developing features with this seems weird to me. As you mentioned, app and db management need to be decoupled to be able to use SchemaHero. Is it a good pattern to decouple schema management from the app using it?
That depends on the type of decoupling. In the case of SchemaHero, migrations are not part of the code of the app; they are part of the definitions of the app together with the rest (e.g., Deployment, Service, etc.).
@@DevOpsToolkit Thanks Viktor, that's clear! I'm just asking myself if this is at all a desirable pattern. It is good practice that every microservice has its own schema. So when would it even make sense to separate schema management from the microservice? Most probably the information about the app change and the migration will then live in different repos, two PRs need to be reviewed, the merging order has to be correct, every migration needs to be 100% backwards compatible (well, it should be in any case) and whatnot :).
Just looking for use cases where it is best practice to separate migrations from the microservice. I love your work Viktor, keep it coming! :)
@@sticksen I do agree that it's best if every microservice has its own schema and I do not think that schema management should be separate from the microservice. I tend to keep schemas together with the manifests of a service, and those manifests are in the same repo as the code of that app. In other words, I do not keep manifests in a different repo (schemas being some of those manifests). What I do keep separate from apps are env manifests that are mostly links to the app manifests with a few env-specific variables. So, my Argo CD or Flux are pointing to env repos which, in turn, are pointing to app repos.
@@DevOpsToolkit That explains it perfectly! Thanks 🙏🏻😀
The thing that bothers me about SchemaHero is that now my schema migrations are Kubernetes-aware. I’d rather have migrations that are agnostic of where they are running, just like my app code is unaware of Kubernetes. That way I can run them locally to test without needing to spin up a k8s cluster.
I think if it limited its scope to managing the lifecycle of an existing db migration tool (e.g. Flyway) then it would’ve been better.
I guess that depends on whether you consider DB schemas to be closer to the code or manifests. I tend to see them being closer to the latter. They are, in a way, similar (or part of) deployment manifests, and we are reaching the point where most of those are k8s YAML definitions.
Also, I imagine that (almost) everyone is using a local Kubernetes cluster (k3d, KinD, Minikube, etc.) for development anyways, so using SchemaHero locally shouldn't be a problem.
All that being said, I think it mostly depends on the level of adoption one has of Kubernetes. If it's low, moving to SchemaHero might be premature. On the other hand, if the adoption is high and most of the stuff is already running in Kubernetes, using SchemaHero could result in a simplification of the process by relying on a single API (Kube API) to manage everything.
What I'm really trying to say is that I would not adopt SchemaHero if I were at the beginning of the Kubernetes journey but, if I were far along, sooner or later I'd want a consolidation under a single API.
@@DevOpsToolkit I’m not so sure about your last paragraph. I’ve been using k8s since 2015/16 and have deployed different applications in different languages (Java, Node.js, Python, etc.) on it but I still don’t feel comfortable with SchemaHero. I don’t really see the point of making one’s DB migrations k8s dependent. Agree to disagree, I guess. 🙂
@@MarkMaglana Agree to disagree as well. Conversations like this are great. Among other things, they give me a much broader perspective and help me "tune" my views to be more in line with what the wider industry is doing.
Cool finding! I found Liquibase and Flyway very useful if you use a Java application, because of the nature of the language and the easy integration. However, I find this tool useful for other languages that don't have this integration by default and need an external way to manage the database life cycle.
Liquibase has been a bit of a pain for us. It often fails and leaves locks behind. I wish it would fail a little better.
@@fpvclub7256 There are some alternatives to solve the locking problem using Kubernetes with Liquibase. Other alternatives such as Flyway are language-specific, so they are less agnostic.
@@javisartdesign Why do you think that Flyway is language-specific? Yes, it can run Java migrations, but it can also be used as a Docker image with SQL files mounted as a volume for apps in any other language.
@@alexandrulazarev6207 With Flyway it's more complicated to support multiple databases at the same time (Oracle, Postgres, etc.), so you have to include placeholders and conditions for statements. Liquibase, however, is capable of detecting the engine and generating SQL statements for the particular database being used.
I wish it supported Elasticsearch 👀