How to Solve Exchange 2016 DAG Failover Problems

With a Database Availability Group (DAG) in Exchange, you have the peace of mind that your mailbox databases are replicated between sites, so that if one of the servers goes down, users can keep working with no manual intervention. In a healthy environment, failing over to another member of the DAG should be easy and should be tested from time to time. Regular testing keeps you in practice and gives you a documented procedure for achieving business continuity if disaster strikes.

But that is the ideal scenario, where everything works as it should. In other cases you may hit issues that you need to know how to handle or recover from. Most companies test their failover once a year, and some do it more often. Failover is not something that only happens during tests, though: it can also occur during the monthly Windows updates, when the primary server needs to be rebooted.

A Live Scenario:

Take, for example, two Exchange 2016 servers: one on the primary site, the preferred owner of the services and databases, and the other the secondary server holding copies of all the databases. The primary site can go down for various reasons, such as a power failure or a generator test. Everything else may come through unaffected, while the Exchange server powers off because of a misconfiguration in the hypervisor's power options.

Unfortunately, after an abrupt power cut like this you can run into issues such as no user being able to access their mailbox once the server has powered off.

The first thing to check in such cases is the DNS records for the mail and Autodiscover services, which should point to both servers, as in the example below.

mail.mydomain.com of type A record pointing to 192.168.0.15
mail.mydomain.com of type A record pointing to 192.168.0.25
autodiscover.mydomain.com of type A record pointing to 192.168.0.15
autodiscover.mydomain.com of type A record pointing to 192.168.0.25
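
To confirm what clients actually resolve, the records can be queried from a workstation. The following is a minimal sketch, assuming the Resolve-DnsName cmdlet is available (DnsClient module, Windows 8 / Server 2012 and later). It reuses the example names and addresses listed above, which you would replace with your own.

```powershell
# Expected A records, taken from the example list above (replace with your own).
$expected = @{
    'mail.mydomain.com'         = @('192.168.0.15', '192.168.0.25')
    'autodiscover.mydomain.com' = @('192.168.0.15', '192.168.0.25')
}

foreach ($name in $expected.Keys) {
    # Ask DNS for the A records of this name; a failed lookup yields an empty list.
    $actual = @(Resolve-DnsName -Name $name -Type A -ErrorAction SilentlyContinue |
                Where-Object { $_.Type -eq 'A' } |
                Select-Object -ExpandProperty IPAddress)

    # Report every expected address that DNS did not return.
    $missing = @($expected[$name] | Where-Object { $_ -notin $actual })
    if ($missing.Count -eq 0) {
        Write-Output "$name resolves to all expected addresses"
    } else {
        Write-Output "$name is missing: $($missing -join ', ')"
    }
}
```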

Though the failover should be seamless, users might get the error:

“Sorry we’re having trouble opening this item. This could be temporary, but if you see it again you might want to restart Outlook. Network problems are preventing connection to Microsoft Exchange”.

Note that, as many articles and knowledge bases mention, the issue can be caused by the DAG taking a little time to complete the failover. If the problem persists and you still get the same error, OWA may show a similar one:

“Your request can’t be completed right now. Please try again later.”

Do not panic at this stage; there is always a reason, and always an alternative.

The first thing to check is the health of the cluster, using PowerShell. Run Get-ClusterNode and confirm that the primary server is shown as Down and the secondary server as Up. Then check the database copies with Get-MailboxDatabaseCopyStatus; you might find that the databases are not mounted and that the Content Index State is shown as Failed. The obvious next step is to mount the databases, but since they failed to mount automatically they will probably not mount manually either. And once the power problem is fixed and the primary server is started again, you might end up with a server that does not boot or has a corrupted operating system.
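
The checks described above could look like the sketch below. The server name EX02 and the database name mbx01 are placeholders, not values from this environment. The first command needs the FailoverClusters module, and the Exchange cmdlets are assumed to run in the Exchange Management Shell on the surviving server.

```powershell
# Cluster view: each node should report a State of Up or Down.
Import-Module FailoverClusters
Get-ClusterNode | Format-Table Name, State -AutoSize

# Database copy view on the surviving server (EX02 is a placeholder name).
Get-MailboxDatabaseCopyStatus -Server 'EX02' |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize

# Attempt a manual mount of one database (mbx01 is a placeholder name).
# If the cluster is unhealthy, this is expected to fail as described above.
Mount-Database -Identity 'mbx01'
```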

Although a primary server that will not boot could be a disaster, there may still be a way to restore business continuity. Since the cluster seems to have failed over to the secondary server, why are the databases not mounting automatically? In such a case, open the Exchange Management Shell and use ESEUtil to identify any issues with the database, for example:

eseutil.exe /mh "M:\mbx\mbx01"

The first thing to look at in the output is the database state. A Dirty Shutdown state means the database was not detached cleanly and still has transaction logs that were never committed to it; it needs recovery before it can mount and may be corrupted. In a DAG, Exchange databases will not mount unless the cluster is running and healthy. So if the primary server is dead and will not boot, you have to evict it from the cluster.
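
Both steps can be sketched as small helper functions. Get-EseDatabaseState and Remove-FailedDagMember are hypothetical names introduced here for illustration, and the .edb path, DAG1, EX01 and mbx01 in the usage comments are placeholders. The removal order (database copies first, then the DAG membership with -ConfigurationOnly, then the cluster node) is an assumption about a typical recovery of an offline member, so review it against your own environment before running anything.

```powershell
function Get-EseDatabaseState {
    # Returns the value of the "State:" line from eseutil /mh output,
    # for example "Dirty Shutdown" or "Clean Shutdown"; $null if no such line exists.
    param([string[]]$HeaderLines)
    foreach ($line in $HeaderLines) {
        if ($line -match '^\s*State:\s*(.+?)\s*$') {
            return $Matches[1]
        }
    }
    return $null
}

function Remove-FailedDagMember {
    # Removes a dead server from the DAG and evicts it from the cluster.
    # Intended for the Exchange Management Shell on a surviving member.
    param(
        [string]$DagName,
        [string]$FailedServer,
        [string[]]$Databases
    )
    foreach ($db in $Databases) {
        # Database copy identities take the form "Database\Server".
        Remove-MailboxDatabaseCopy -Identity ('{0}\{1}' -f $db, $FailedServer) -Confirm:$false
    }
    # -ConfigurationOnly updates Active Directory without contacting the offline server.
    Remove-DatabaseAvailabilityGroupServer -Identity $DagName -MailboxServer $FailedServer -ConfigurationOnly
    Import-Module FailoverClusters
    Remove-ClusterNode -Name $FailedServer -Force
}

# Example use (all names and the path are placeholders):
# $state = Get-EseDatabaseState (eseutil.exe /mh 'M:\mbx\mbx01\mbx01.edb')
# if ($state -eq 'Dirty Shutdown') { Write-Warning 'Database was not shut down cleanly' }
# Remove-FailedDagMember -DagName 'DAG1' -FailedServer 'EX01' -Databases 'mbx01'
```

Defining the functions changes nothing on the server; the destructive part only runs when Remove-FailedDagMember is called explicitly.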

At this stage, after a few hiccups along the way and a restart of the secondary server, the databases should mount with no issues. If clients still cannot connect, change the DNS records that point to the primary server so that they point to the secondary. This spares you the trouble of going to each Outlook client and recreating the profile. Although this is not the proper fix, it will at least get all the clients connected and able to access their mailboxes until a plan to reinstall a server is in place.
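
If the zone is hosted on a Windows DNS server, the change could be scripted roughly as follows. This is a sketch under assumptions: the DnsServer module is available, the zone is mydomain.com, and 192.168.0.15 is the primary server's address (the DNS example earlier does not say which address belongs to which server). Because each name already has a record for the secondary server, removing the primary's record has the same effect as repointing it.

```powershell
# Placeholders: the zone name and the failed primary server's address.
$zone      = 'mydomain.com'
$primaryIp = '192.168.0.15'

foreach ($name in 'mail', 'autodiscover') {
    # Remove only the A record that points at the failed primary server;
    # the record for the secondary server is left in place.
    Remove-DnsServerResourceRecord -ZoneName $zone -Name $name -RRType 'A' `
        -RecordData $primaryIp -Force
}

# Run this line on a client, not on the DNS server: it flushes the local
# resolver cache so the change is picked up without waiting for the TTL.
Clear-DnsClientCache
```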

In most cases the secondary server's specifications are lower than the primary's, since it is there for disaster recovery and is barely used except to replicate data. You might notice users frequently disconnecting from their mailboxes, and in the server's Application log you may see errors such as “The Exchange mailbox Server: [mail.mydomain.com] has reached its timeout threshold. The mailbox server will be protected from new requests for [60] seconds.” This error means the server is so busy with requests that it has to reject new ones until it has caught up.
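
To see how often this is happening, the Application log can be searched for the message text. A minimal sketch, assuming it runs on the secondary server with permission to read the event log; the 24-hour window is an arbitrary choice.

```powershell
# Search the last 24 hours of the Application log for the throttling message.
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; StartTime = (Get-Date).AddHours(-24) } `
        -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -like '*has reached its timeout threshold*' } |
    Sort-Object TimeCreated |
    Select-Object TimeCreated, ProviderName, Id, Message
```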

If you get to this stage, you will most likely want to rush the plan to install a new Exchange server and start transferring the mailboxes. You can export mailboxes to PST with the PowerShell cmdlet New-MailboxExportRequest, but because the only solution was to evict the server, and the databases still reference a copy on the failed primary server, errors can occur and you may not be able to export any PST files. Unfortunately, you cannot export directly from an EDB file without its parent Exchange server online. Third-party Exchange database recovery software is the best alternative for resolving an Exchange 2016 DAG failover problem.
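
For reference, an export request that does work looks roughly like the sketch below. The mailbox alias jsmith and the share \\FS01\PST are placeholders. The account running it needs the Mailbox Import Export role, and the target must be a UNC path that the Exchange Trusted Subsystem group can write to.

```powershell
# Queue an export of one mailbox to a PST file on a network share (placeholder names).
New-MailboxExportRequest -Mailbox 'jsmith' -FilePath '\\FS01\PST\jsmith.pst'

# Follow the progress of all export requests.
Get-MailboxExportRequest | Get-MailboxExportRequestStatistics |
    Format-Table Name, Status, PercentComplete -AutoSize
```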

Luckily, with Stellar Repair for Exchange you can resolve an Exchange 2016 DAG failover problem and easily open all the EDB files in the application. After a quick scan you can browse through the EDB database. Apart from recovering mailboxes from corrupt EDB files and exporting public folders to PST, you can export directly into a live Exchange database once a new DAG or Exchange server is configured. This saves a lot of time, gets everybody working again, and minimizes downtime.
