How to Fix – Failover Cluster Manager Disconnecting and Reconnecting – Event IDs 1146 and 1265?

Summary: An unexpected, unplanned failover or cluster shutdown may render your mailboxes inaccessible. In this post, we look at the Failover Cluster Manager disconnecting and reconnecting issue that is logged with Event IDs 1146 and 1265, and discuss possible ways to troubleshoot it. You’ll also find out about an Exchange repair tool that can help recover mailboxes and other data from a corrupt database.

Exchange Server provides high availability through the Database Availability Group (DAG). In this article, we discuss two events that appear in the Event Viewer – Event ID 1146 and Event ID 1265. Both events are raised by the Windows Cluster Service, which the Exchange Server DAG depends on.

Let’s take a detailed look at the mentioned events.

Event ID 1146 – Microsoft Windows Failover Clustering

Event ID 1146 is logged with the following message:

“The Cluster Resource Host Subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor.”

The message indicates that the Cluster Resource Host Subsystem (RHS) process has stopped working. To troubleshoot the issue, first determine which resource DLL is responsible, then report it to your Exchange Server expert or supplier. Further investigation is needed to understand the root cause of the problem.

The message also shows that the subsystem has crashed and that the cluster will attempt to restart it. This usually happens either as part of a recovery process or because a resource is in deadlock.

In the cluster service, if the deadlock timeout (which by default is 20 minutes) passes, the Cluster Resource Host Subsystem (RHS) deems the server as failed and forces a failover. If a node is already down, so that only one node remains in the cluster, the cluster shuts down as a safeguard.

Event ID 1265 – Microsoft Windows Failover Clustering

This event appears when the Cluster Resource Host Subsystem (RHS) is deadlocked or a resource DLL has crashed. At this stage, the process is terminated. The event is a notification that the process has crashed, and it accompanies Event ID 1146, after which the cluster may shut down.
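Because the two events tend to be logged together, it can help to group them into incidents when reviewing an exported event list. The following is only a minimal sketch: the record layout (dictionaries with `id` and `time` keys), the five-minute grouping window, and the sample timestamps are all illustrative assumptions, not something produced by the Event Viewer or described in this article.

```python
from datetime import datetime, timedelta

# The two event IDs discussed in this article.
RHS_EVENT_IDS = {1146, 1265}

def rhs_incidents(events, window=timedelta(minutes=5)):
    """Group 1146/1265 records that occur within `window` of the previous one.

    `events` is an iterable of dicts with "id" and "time" keys
    (a hypothetical layout chosen for this sketch).
    """
    relevant = sorted(
        (e for e in events if e["id"] in RHS_EVENT_IDS),
        key=lambda e: e["time"],
    )
    incidents = []
    for event in relevant:
        # Extend the current incident if this event closely follows the last one.
        if incidents and event["time"] - incidents[-1][-1]["time"] <= window:
            incidents[-1].append(event)
        else:
            incidents.append([event])
    return incidents

# Made-up sample records for illustration only.
events = [
    {"id": 1265, "time": datetime(2024, 1, 1, 10, 0, 0)},
    {"id": 1146, "time": datetime(2024, 1, 1, 10, 0, 5)},
    {"id": 9999, "time": datetime(2024, 1, 1, 10, 1, 0)},  # placeholder ID, filtered out
    {"id": 1146, "time": datetime(2024, 1, 1, 14, 30, 0)},
]

for incident in rhs_incidents(events):
    print([e["id"] for e in incident])
```

Each printed list is one incident; a 1265 followed closely by a 1146 suggests a deadlock that ended with the RHS process being terminated and restarted.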

Troubleshooting the Issue

Let’s discuss the process to troubleshoot the issue. Start with the cluster log, where entries similar to the following appear:

ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'ResourceName'.

INFO  [RHS] Enabling RHS termination watchdog with timeout 1200000 and recovery action 3.

ERR   [RHS] Resource ResourceName handling deadlock. Cleaning current operation and terminating RHS process

From these entries, you can identify exactly which resource is causing the issue (shown here as 'ResourceName'). The underlying cause may be an I/O problem or poor storage performance, so investigate the load on the disks. Also check whether there is a faulty hard drive in the RAID array, or whether a faulty drive was replaced and the RAID rebuild is now degrading the performance of the server.
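When a cluster log is long, the same information can be pulled out with a short script. The sketch below is an illustration only: the regular expressions are modelled on the three sample lines shown above, the function name is hypothetical, and it assumes the watchdog timeout is expressed in milliseconds. Real log lines carry additional prefixes such as timestamps, which is why the patterns search within each line instead of matching from its start.

```python
import re

# Patterns modelled on the sample log lines in this article (assumed format).
DEADLOCK_RE = re.compile(
    r"RhsCall::DeadlockMonitor: Call (\w+) timed out for resource '([^']+)'"
)
WATCHDOG_RE = re.compile(
    r"Enabling RHS termination watchdog with timeout (\d+) and recovery action (\d+)"
)

def summarize_rhs_deadlocks(lines):
    """Return (deadlocks, watchdog_minutes) found in an iterable of log lines.

    deadlocks is a list of (call_name, resource_name) tuples.
    watchdog_minutes is None when no watchdog line is present.
    """
    deadlocks = []
    watchdog_minutes = None
    for line in lines:
        match = DEADLOCK_RE.search(line)
        if match:
            deadlocks.append((match.group(1), match.group(2)))
            continue
        match = WATCHDOG_RE.search(line)
        if match:
            # Assumption: the timeout value is in milliseconds.
            watchdog_minutes = int(match.group(1)) / 60000
    return deadlocks, watchdog_minutes

sample = [
    "ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'ResourceName'.",
    "INFO  [RHS] Enabling RHS termination watchdog with timeout 1200000 and recovery action 3.",
    "ERR   [RHS] Resource ResourceName handling deadlock. Cleaning current operation and terminating RHS process",
]

deadlocks, minutes = summarize_rhs_deadlocks(sample)
print(deadlocks)  # which call timed out, and for which resource
print(minutes)    # watchdog timeout converted to minutes
```

To run it against a real log, pass an open file object in place of `sample`; the resource names it reports are the ones to check first for storage or I/O problems.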

To Conclude

You might have issues with the Exchange Servers that form part of the cluster. After an abrupt restart, the services might not start, or the databases might not mount because of corrupted transaction logs or a corrupted database. The solution is to restore the Exchange mailbox database, but this means overhead for the department and possible data loss for the business.

You can recreate the Exchange Servers and use the Recovery Mode to retrieve all the configurations of the Exchange Server, while keeping the same computer name and IP address. The database recovery part can be done with Stellar Repair for Exchange. This application allows you to open corrupt EDB files from any Exchange Server version, with no size limit. You can browse the data stores and granularly export the recovered data to PST, directly to a live Exchange Server database, or to an Office 365 tenant. The application is not limited to user mailboxes. It can also recover user archives, disabled mailboxes, shared mailboxes, and even public folders. Stellar Repair for Exchange can help reduce the RPO and RTO, as well as the risk of data loss and the time needed for recovery.
