Senior DBA Class High Availability and Disaster Recovery Part 1

แชร์
ฝัง
  • เผยแพร่เมื่อ 19 พ.ย. 2024

ความคิดเห็น • 19

  • @malcomreynolds4103
    @malcomreynolds4103 2 ปีที่แล้ว +3

    Never been a fan of the fear many DBAs have around clustering. Not knowing how something works I think is a poor reason to fear it. If you have concerns, go lab it up. Put more RAM in your computer and make a little network and create yourself some clusters. If you don't have concerns, lab it up anyways. It is important to know how it works.
    While availability groups add complexity, it also removes many of the problems that failover cluster instances have around shared storage. They also provide you with way more tools and options to resolve problems. Yes, it makes you need to patch and support more servers - but you also don't have to patch or support them off hours anymore. When something goes down in the middle of the night, the listener fails over and you stay in bed until morning while the VM team bangs their head on what happened instead. The stress over not knowing whether a patch would create a problem in production that was fine in dev, is partially eliminated, you can install the patch on one secondary node and run all your most heavily used read only queries on the secondary node and have a pretty good production 'test' of the patch before you commit to making the patch permanent.
    Synchronizing jobs and logins, and managing jobs in HA is trivial with a tool like dbatools and ola hallengren maintenance solution. But even without those you can just add a step in change control to deploy changes to both nodes and allow regular operational jobs run and fail on secondaries without alerts and use an SSRS report to monitor their success on the active node, then write maintenance type jobs to gracefully terminate if they node they are running on is not primary.
    I have gotten a lot of pushback in some environments for the supposed complexity and amount of support in availability groups, but when I run across a problem in my stuff - if it is in maintenance I revert the problem and then go home at 5 PM. If it is something else that will take a long time to recover, I kill the problem node, go home at 5PM and then build a new node the next day. My coworkers instead start their maintenance at 9PM four times per month and if there is a problem on any of the servers they are maintaining during the window, they may work all night into the next day resolving it. If I need to restore from backup, I have backups available at a much higher frequency than the RPO because I have the resources to run them more frequently spread across multiple nodes. In all, I am spending less time supporting a four node availability group than some of my coworkers do supporting a single SQL server, and I spend next to zero of my personal time after hours supporting it.

  • @adammark48
    @adammark48 3 ปีที่แล้ว +2

    Great video! Love your sense of humour. I look forward to watching your other videos.
    My only request - please reduce sound effect volume.

  • @mazensharkawy9525
    @mazensharkawy9525 ปีที่แล้ว

    Top notch comedian and DBA at the same time!
    Great work man

  • @fernandomori9351
    @fernandomori9351 2 ปีที่แล้ว

    Great Ozar, always necessary information

  • @cwtrain
    @cwtrain 3 ปีที่แล้ว +1

    "There are a lot of features that they bring out and they *don't* invest in. We're not going to name any names..."
    _Ugh. Service Broker._
    "... Service Broker."
    _Ha! My man!_

  • @sleddy01
    @sleddy01 ปีที่แล้ว

    Really good information but all the comment pop ups were just too distracting for me.

    • @BrentOzarUnlimited
      @BrentOzarUnlimited  ปีที่แล้ว +1

      Sorry to hear that! This channel probably isn't for you, then.

  • @TheEnthu19
    @TheEnthu19 ปีที่แล้ว +1

    I just subscribed to you for "All of you are incompetent" :D at 16:08

  • @farisah
    @farisah 3 ปีที่แล้ว

    You can create a job that do a failover for other databases (AGs) in case of failover for one database (AG) on Standard/Enterprise Edition. You have to create an alert that triggers that job.

    • @BrentOzarUnlimited
      @BrentOzarUnlimited  3 ปีที่แล้ว +2

      By all means, go for it! Then open source it, publish it on Github, and other people can help you debug it. It's a lot trickier than it looks at first glance. (I've done it for clients and we've had to walk it back out due to problems with edge cases.)

    • @farisah
      @farisah 3 ปีที่แล้ว

      @@BrentOzarUnlimited Oh, that was bad. I had this done in lab for couple of weeks before I deployed it last year. It was tricky at first, the one I found. It only worked if you did a failoever on the first AG. It failed if you tried to failover other AGs! I createdd another procedure and it has worked since then.

  • @swetha908
    @swetha908 3 ปีที่แล้ว

    Love from india ❤️

  • @kaisirkhan5101
    @kaisirkhan5101 8 หลายเดือนก่อน

    No hands on