General questions: * Set to asynch - what are the implications of this? * Upgrade secondary - so you can do this while the AG / DBs are running? It doesn't require downtime? what if the update requires a machine restart? What if the update is for Windows OS and that requires a restart? * failover to secondary now prompts to accept possible data loss - I presume this is because we set to asynch? What if this is unacceptable and we want to guarantee no data loss? Do we have to schedule an outage with users? * data movement was suspended (first time) - Why? Why do we re-start it before upgrading all replicas? If you had 20 DBs in the AG would you have to recommence data movement on each one separately? Doesn't this mean you will now be syncing from a higher version SQL Server to a lower version? Will this always work / be compatible?
First "Set to asynch" Usually set for the DR site where network latency is significant - If you don't use this setting, it can have impact on your primary, Asynch setting helps to let Primary know that Transaction is committed on Primary and let's move on to the next tran without caring if same tran is committed on Secondary replica. Synch settings will force Primary to wait until tran is committed on secondary replica which means if latency exist, your log will keep growing on primary and can go out of control for busy databases. Now the issue with this setting is "Data lose" if you failover to secondary, it is going to ask you always that you accept data lose, in my experience it is 3 to 4 second the max unless your network is really slow. If you don't want data lose, I will set this to Synch mode, while patching you don't need to do really anything as long as you are not going to have secondary replica down for more than hour (Caution, please check if primary has enough log space to compensate the log growth while primary accumulate for later commit once secondary comes online) Why DM was suspended (By design, Asynch + manual Failover) will cause db to suspend - you have to resume all of them manually, there are script that will look for any DM suspended status and resume on entire server in few seconds. please do research to get that TSQL or Powershell script. During the upgrade process, a secondary replica will not be available for failover or for read-only operations, and after the upgrade, it may take some time for the secondary replica to catch up with the primary replica node depending upon the volume of activity on the primary replica node (so expect high network traffic). Also be aware that after the initial failover to a secondary replica running a newer version of SQL Server, the databases in that Availability Group will run through an upgrade process to bring them to the latest version. During this time, there will be no readable replicas for any of these databases. Downtime after the initial failover will depend on the number of databases in the Availability Group. If you plan on failing back to the original primary, this step will not be repeated when you fail back. It is supported process from MS. I hope It make sense what I wrote?
is it compulsory to change synchronous commit to async, instead of that suspend data movement on secondary and apply patch on secondary?
General questions:
* Set to asynch - what are the implications of this?
* Upgrade secondary - so you can do this while the AG / DBs are running? It doesn't require downtime? what if the update requires a machine restart? What if the update is for Windows OS and that requires a restart?
* failover to secondary now prompts to accept possible data loss - I presume this is because we set to asynch? What if this is unacceptable and we want to guarantee no data loss? Do we have to schedule an outage with users?
* data movement was suspended (first time) - Why? Why do we re-start it before upgrading all replicas? If you had 20 DBs in the AG would you have to recommence data movement on each one separately? Doesn't this mean you will now be syncing from a higher version SQL Server to a lower version? Will this always work / be compatible?
All are great questions- let me give you all answers tomorrow in detail.
First "Set to asynch" Usually set for the DR site where network latency is significant - If you don't use this setting, it can have impact on your primary, Asynch setting helps to let Primary know that Transaction is committed on Primary and let's move on to the next tran without caring if same tran is committed on Secondary replica. Synch settings will force Primary to wait until tran is committed on secondary replica which means if latency exist, your log will keep growing on primary and can go out of control for busy databases. Now the issue with this setting is "Data lose" if you failover to secondary, it is going to ask you always that you accept data lose, in my experience it is 3 to 4 second the max unless your network is really slow.
If you don't want data lose, I will set this to Synch mode, while patching you don't need to do really anything as long as you are not going to have secondary replica down for more than hour (Caution, please check if primary has enough log space to compensate the log growth while primary accumulate for later commit once secondary comes online)
Why DM was suspended (By design, Asynch + manual Failover) will cause db to suspend - you have to resume all of them manually, there are script that will look for any DM suspended status and resume on entire server in few seconds. please do research to get that TSQL or Powershell script. During the upgrade process, a secondary replica will not be available for failover or for read-only operations, and after the upgrade, it may take some time for the secondary replica to catch up with the primary replica node depending upon the volume of activity on the primary replica node (so expect high network traffic). Also be aware that after the initial failover to a secondary replica running a newer version of SQL Server, the databases in that Availability Group will run through an upgrade process to bring them to the latest version. During this time, there will be no readable replicas for any of these databases. Downtime after the initial failover will depend on the number of databases in the Availability Group. If you plan on failing back to the original primary, this step will not be repeated when you fail back.
It is supported process from MS. I hope It make sense what I wrote?
@@TechBrothersIT great question and good answer it really helped
Pure gold man
+Rick Glorie Thanks!
Do we need to restart the servers after update and how?