With growing workloads and the need for high availability and performance, especially across geographically separate sites, the way forward for an Exchange Server 2013 setup is to configure a Database Availability Group (DAG) with mailbox database copies. To set this up you must first meet the requirements: Exchange Server 2013 Standard or Enterprise (depending on how many databases you will have), and two or more servers hosting Exchange that run the same operating system version and are updated with the same patches.
Each DAG should have one network dedicated to MAPI traffic and another dedicated to replication. Exchange supports multiple DAG networks, and a minimum of two is recommended. Then, depending on your mail flow and the number of users, you must size the hardware and the bandwidth between the sites accordingly.
If you go ahead with the setup without meeting the requirements, you may end up with replication or health issues on your databases. We will look at such issues, and in particular at the database redundancy health check failing on Exchange Server 2013.
Furthermore, a failing health check can be a sign of a damaged Exchange database. In that case, check the database status and repair it with an Exchange recovery tool such as Eseutil or Stellar Repair for Exchange.
The scenario is an Exchange Server DAG with two servers, each running the Mailbox and Client Access roles, located at different sites on different subnets and connected over a site-to-site VPN. Both servers run Windows Server 2012 R2 and Exchange Server 2013 Standard.
You might notice a number of errors in the Application event log from the MSExchangeRepl source, with event IDs 2059, 2153 and 4113. They are caused by intermittent issues but show up quite often in the event logs.
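One way to collect these events in one place is to filter the Application log from PowerShell. This is a minimal sketch, to be run on each DAG member; the seven-day window is an assumption and can be adjusted.

```powershell
# Pull the last 7 days of the three replication event IDs from the Application log.
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'MSExchangeRepl'
    Id           = 2059, 2153, 4113
    StartTime    = (Get-Date).AddDays(-7)   # assumed look-back window
} -ErrorAction SilentlyContinue |           # suppress the error raised when nothing matches
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -Wrap
```

Sorting by time makes it easier to line the errors up against VPN drops or backup windows.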
Database redundancy health check failed.
Database copy: TempDB
Redundancy count: 1
Error: The number of configured copies for database ‘TempDB’ (1) is less than the required redundancy count (2).
Name                     Status              RealCopyQueu  InspectorQue  ReplayQueue  CIState
----                     ------              ------------  ------------  -----------  -------
DAG01 MBX Store 1\EXC02  FailedAndSuspended  111           0             0            Failed
DAG01 MBX Store 1\EXC01  Mounted             0             0             0            Healthy
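A copy status listing like the one above can be produced in the Exchange Management Shell with Get-MailboxDatabaseCopyStatus. The sketch below assumes the database is named "DAG01 MBX Store 1", as the listing suggests, and selects the columns explicitly so that they are not truncated.

```powershell
# Status of every copy of the database, one row per server.
Get-MailboxDatabaseCopyStatus -Identity 'DAG01 MBX Store 1' |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize

# More detail on the failed copy, including any error text and suspend comment.
Get-MailboxDatabaseCopyStatus -Identity 'DAG01 MBX Store 1\EXC02' |
    Format-List Name, Status, *Queue*, ContentIndexState, *Error*, SuspendComment
```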
Troubleshooting the issue
The first thing to check is what changed on the Exchange Server between the time it was healthy and the time the issue occurred, ideally from a change log, since any such change could point to the cause of the errors.
If no changes were made that could harm connectivity or replication health, I would personally start with the connection itself: its speed, and whether there are disconnections or lost packets in transit. Also check for any changes to the network firewall or its configuration.
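A quick connectivity check between the two DAG members could look like the following sketch, run from EXC01. It assumes the remote server is reachable by the name EXC02, that ICMP is allowed across the VPN, and that replication uses the default port (TCP 64327), which can be changed in the DAG settings.

```powershell
$remote = 'EXC02'   # the other DAG member, as named in the copy status output

# Send 50 pings and count how many came back, to spot packet loss over the VPN.
$replies = @(Test-Connection -ComputerName $remote -Count 50 -ErrorAction SilentlyContinue)
'{0} of 50 replies, average {1:N0} ms' -f $replies.Count,
    ($replies | Measure-Object ResponseTime -Average).Average

# Check that the default DAG replication port is reachable through the firewall.
Test-NetConnection -ComputerName $remote -Port 64327
```

Running it at different times of day helps to tell a permanently blocked port from an intermittently saturated link.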
Next, check that the Active Directory servers on both sides are functioning properly. Run repadmin /replsummary to confirm that replication between the sites is working, then repadmin /queue to see whether any items are blocked or unprocessed. Running repadmin /showrepl also gives an overview of replication for each Active Directory partition. Alongside this, check that DNS is healthy and replicating between the sites. Exchange Server depends heavily on Active Directory, and if AD and DNS are not healthy you will certainly have Exchange problems.
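The checks above can be run from an elevated prompt on a domain controller in each site. The dcdiag DNS test is an addition not mentioned in the text, included here as one possible way to check DNS health.

```powershell
repadmin /replsummary   # per-DC summary of replication failures and largest deltas
repadmin /queue         # inbound replication requests still waiting on this DC
repadmin /showrepl      # last replication attempt per partition and partner
dcdiag /test:dns        # DNS health test (an assumed extra step)
```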
Another thing to take into consideration is the number of mailboxes, their sizes, and the size of the mailbox database itself. It would be wise to split the users across databases by department or location. One way to do this is to create a new database and gradually move mailboxes into it. Otherwise all your eggs are in one basket: if all the mailboxes are in one database and something happens to it, all your users are affected.
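As a rough sketch of that approach in the Exchange Management Shell: the new database name "MBX Store 2" and the batch size of 10 are hypothetical, and the source database name is taken from the copy status output. Keep in mind that Exchange Server 2013 Standard limits the number of mounted databases per server, and that a new database only becomes redundant once a copy is added on the second DAG member.

```powershell
# Current size of each database, to see how much is sitting in one place.
Get-MailboxDatabase -Status | Format-Table Name, Server, DatabaseSize -AutoSize

# Create and mount a second database (hypothetical name).
New-MailboxDatabase -Name 'MBX Store 2' -Server 'EXC01'
Mount-Database -Identity 'MBX Store 2'

# Move a first small batch of mailboxes and watch the progress.
Get-Mailbox -Database 'DAG01 MBX Store 1' -ResultSize 10 |
    New-MoveRequest -TargetDatabase 'MBX Store 2'

Get-MoveRequest | Get-MoveRequestStatistics |
    Format-Table DisplayName, Status, PercentComplete -AutoSize
```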
Another option is to re-seed the database, but depending on its size and the available bandwidth this can take a considerable amount of time to complete.
Also check the backup software in use. It must be compatible with Exchange Server and DAG-aware, and you should confirm this with the vendor before using it, because the wrong backup software can wreak havoc in your setup and may leave you with corrupted databases.
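To get a feel for how long a re-seed might take, a back-of-the-envelope estimate can help. The function below is a hypothetical helper, not an Exchange cmdlet; the 70% link utilisation is an assumption, and it ignores compression, protocol overhead and competing traffic.

```powershell
# Rough estimate only: size in GB, link speed in megabits per second.
function Get-ReseedEstimate {
    param(
        [double]$DatabaseSizeGB,     # size of the EDB file to seed
        [double]$LinkMbps,           # nominal VPN bandwidth
        [double]$Utilisation = 0.7   # assumed fraction of the link seeding can use
    )
    $megabits = $DatabaseSizeGB * 1024 * 8              # GB -> MB -> megabits
    $seconds  = $megabits / ($LinkMbps * $Utilisation)
    [TimeSpan]::FromSeconds($seconds)
}

# Example: a 200 GB database over a 50 Mbps link.
Get-ReseedEstimate -DatabaseSizeGB 200 -LinkMbps 50
```

With these example figures the estimate comes to about 13 hours, which is a good reason to schedule the re-seed outside working hours.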
If you run the PowerShell command Test-ReplicationHealth and see errors on the DatabaseRedundancy and DatabaseAvailability checks, go back through the setup. If you are using one network card instead of the recommended minimum of two, add another network card. I need to stress the importance of not haggling over the requirements, as there is a good reason for them.
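A minimal sketch of that check, assuming the two DAG members are named EXC01 and EXC02 as in the copy status output:

```powershell
# Run the replication health checks against each DAG member.
foreach ($server in 'EXC01', 'EXC02') {
    Test-ReplicationHealth -Identity $server |
        Format-Table Server, Check, Result, Error -Wrap
}

# List the DAG networks to confirm MAPI and replication traffic are separated.
Get-DatabaseAvailabilityGroupNetwork |
    Format-List Identity, Subnets, Interfaces, ReplicationEnabled, MapiAccessEnabled
```

If only one DAG network is listed, MAPI and replication traffic are sharing the same card.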
To re-seed the other copy, open the Exchange Admin Center and navigate to Servers, then Databases. You will notice that the Bad Copy Count for the database is 1. Click on the database that has the issue.
On the database copy that shows the issue, click Update and follow the wizard. If all goes well the database will re-seed and the issue should be resolved, but before re-seeding consider the impact on the connection and on the users.
If the problem persists, and there may be occasions when the re-seed fails, you can fall back on applications such as Stellar Repair for Exchange to open corrupted EDB files and export the mailboxes to PST, directly to a live Exchange Server, or to an Office 365 tenant.
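The same re-seed can be started from the Exchange Management Shell. This is a sketch under the assumption that the failed copy is "DAG01 MBX Store 1\EXC02", as shown in the copy status output. Note that -DeleteExistingFiles discards the database and log files of the passive copy before seeding.

```powershell
$copy = 'DAG01 MBX Store 1\EXC02'   # assumed identity of the failed copy

# Seeding requires a suspended copy. A FailedAndSuspended copy already is;
# for any other state, suspend it first:
# Suspend-MailboxDatabaseCopy -Identity $copy -SuspendComment 'Reseed' -Confirm:$false

# Discard the existing files on the passive copy and seed from the active copy.
Update-MailboxDatabaseCopy -Identity $copy -DeleteExistingFiles

# Confirm the copy returns to Healthy and the queues drain.
Get-MailboxDatabaseCopyStatus -Identity $copy |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState
```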