HA Settings showing "ha service is stopped" on both ScaleArc nodes in an HA pair

Overview

Your ScaleArc HA Settings show that "ha service is stopped" for both primary and secondary nodes in the HA pair, causing a service outage.

Restarting the HA service fails with "Error 5574 HA service started but the node was not online".
Clicking the "Delink From Secondary" button delinks the secondary node, but the secondary node still appears in the UI on the primary node. All clusters remain visible and continue to handle traffic, yet the "ha service is stopped" error persists.

Solution

This issue occurs when the SQLite HA configuration database on the primary node of the HA pair is corrupted.

The solution in this scenario is to delink the HA and manually rebuild the clusters by following these steps:

  1. Delink the HA on both ScaleArc instances and configure them as standalone. If the "Delink From Secondary" button in the UI does not complete this step, see How to Force delink HA in ScaleArc v3.11.0.2 and later (Pacemaker based HA).
    • Note that in an HA flip-flop situation, the cluster information can disappear from the primary node.
  2. Collect logs with the idblog_collector.sh script and take a separate backup of the /system directory on both instances. See Capturing ScaleArc logs using idblog_collector.sh script for more information on this step.
  3. Manually rebuild the secondary node and re-create the clusters based on the following information:
    • The sqlite.bak file on the secondary node contains the VIPs
    • The /etc/freetds.conf on the primary node contains information on the database servers
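
To make the database server details in step 3 easier to read, the /etc/freetds.conf file can be summarized with a short script. The following is a minimal sketch, not a ScaleArc tool: it assumes the file follows the standard FreeTDS layout (a [section] per server with "host = ..." and "port = ..." lines, and ";" or "#" comments). The function names parse_freetds_conf and list_database_servers are hypothetical, and the section names ScaleArc writes may differ on your system.

```python
from typing import Dict, List, Optional, Tuple


def parse_freetds_conf(text: str) -> Dict[str, Dict[str, str]]:
    """Return {section_name: {key: value}} for freetds.conf-style text."""
    sections: Dict[str, Dict[str, str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (';' or '#').
        if not line or line[0] in ";#":
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new section, e.g. [global] or one entry per server.
            current = line[1:-1].strip()
            sections[current] = {}
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][key.strip().lower()] = value.strip()
    return sections


def list_database_servers(text: str) -> List[Tuple[str, str, str]]:
    """Return (section, host, port) for every section except [global]."""
    servers = []
    for name, options in parse_freetds_conf(text).items():
        if name.lower() == "global":
            continue
        servers.append((name, options.get("host", ""), options.get("port", "")))
    return servers


def print_servers(path: str = "/etc/freetds.conf") -> None:
    """Print one tab-separated line per server found in the file at path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for name, host, port in list_database_servers(handle.read()):
            print(f"{name}\t{host}\t{port}")
```

Running print_servers() on the primary node (or against a copy of the file taken from the /system backup) prints one line per server entry, which can then be used as a checklist when re-creating the clusters. The script only reads the file and does not change any ScaleArc configuration.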
