Overview

The server status (shown below with red arrow) in the Cluster Info section in Live Monitor switches frequently between Red & Green if the alerting service for the AlwaysOn HA cluster isn't configured correctly.

Solution

Pre-Requisites

Microsoft SQL Server Management Studio (SSMS)

Diagnosis

Ask the customer to share the log file at /data/logs/services/always_on_monitor.log.

</supportagent>

Check the log file at/data/logs/services/always_on_monitor.log for following entries:

YYYY-MM-DD HH:MM:SS.sss ERROR AlwaysonMonitor(1): Error while fetching group database id: ('42000', '[42000] [FreeTDS][SQL Server]The user does not have permission to perform this action. (297) (SQLExecDirectW)')
YYYY-MM-DD HH:MM:SS.sss ERROR AlwaysOnMonitor(1): Problem determining current master: ('42000', '[42000] [FreeTDS][SQL Server]The user does not have permission to perform this action. (297) (SQLExecDirectW)')
YYYY-MM-DD HH:MM:SS.sss ERROR AlwaysOnMonitor(1): Failed to inform core about server role/type :  [INFO] The failover command did not change any role/role-setting.
YYYY-MM-DD HH:MM:SS.sss ERROR AlwaysOnMonitor(1): Problem determining health status from hadr_avail: ('42000', '[42000] [FreeTDS][SQL Server]The user does not have permission to perform this action. (297) (SQLExecDirectW)')

This indicates a permission issue for the ScaleArc user to query tables holding AlwaysON status, effectively meaning that these queries are not executing properly with the ScaleArc kerberized user:

SELECT database_name, group_database_id FROM sys.availability_databases_cluster WHERE group_id = <param> and database_name in <param>;
SELECT a.endpoint_url, a.replica_server_name, b.role, b.connected_state, b.operational_state FROM sys.availability_replicas a, master.sys.dm_hadr_availability_replica_states b WHERE a.group_id = b.group_id and a.replica_id = b.replica_id and a.group_id='<param>';

SELECT endpoint_url, replica_server_name from sys.availability_replicas WHERE replica_server_name = (SELECT primary_replica FROM sys.dm_hadr_availability_group_states);

Steps To Fix

You must ensure that the ScaleArc user has full access to the HA system tables (referenced in the queries above) to be able to determine the AlwaysON status and detect a role change in case an AlwaysON failover is performed from the database cluster side.

Log in to the database servers as an Administrator of the affected cluster with SSMS, and in the Object Explorer, navigate to Security > Logins for each database and make sure that the following permissions are enabled on the login properties of each ScaleArc account on the databases:

Connect to Any database
Create Any database
View Any database
View definition
View server state

Testing

You can test this by performing an HA failover and the alerts will clear out on all instances of the ScaleArc HA cluster. No new errors relating to this will be logged in /data/logs/services/always_on_monitor.log & the cluster remains in green as expected.

AlwaysOn Cluster Server Status Flipping Between Red & Green

Overview

Solution

Pre-Requisites

Diagnosis

Steps To Fix

Testing

Comments

Other articles in this section

Contact Details